Migrating from NetApp to Windows File Servers with PowerShell – part 2

Previously we saw how PowerShell and RoboCopy can be used to sync multi-terabyte file shares from NetApp to Windows. What I did not tell you was that this script choked and died horribly on a single share in our infrastructure. You may have seen it commented out in the previous script? “#,’R25′”?

CollegeNet Resource25… my old enemy. These clowns worked around a bug in their product (an inability to read an open text column in an Oracle DB table) by copying every text row in the database to its own file on a file server, and to make matters worse they copy all of the files to the same directory. Why is this bad? Ever try to get a directory listing on a directory with 480,000 1k files? It’s bad news. Worse, it kills robocopy. Fortunately, we have a workaround.

The archive utility “7-zip” is able to wrap up the nasty directory into a single small file, which we then can unpack on the new file server. Not familiar with 7-Zip? For shame! Get it now, it’s free:

7-zip ignores most file attributes, which seems to speed up the copy process a bit. Using robocopy, ouy sync operation would either run for hours on this single directory, or just hang up forever. With 7-zip, we get the job done in 30 minutes. Still slow, but better than never.

Troublesome files are found in the R25 “text_comments” directory, a subdirectory of “text”. We have prod, pre-prod, and test environments, and so need to do a few separate 7-zip archives. Note that a little compresson goes a long way here. When using “tar” archives, my archive was several gb in size. With the lowest level of compression, we squeeze down to only about 14 Mb. How is this possible? Well, a lot of our text comment files were empty, but uncompressed they still take up one block of storage. Over 480,000 blocks, this really adds up.

Code snippet follows.

#Sync R25 problem dirs

Set-PSDebug -Strict

# Initialize the log file:
[string] $logfile = "s:\r25Sync.log"
remove-item $logfile -Force
[datetime] $startTime = Get-Date
[string] "Start Time: " + $startTime | Out-File $logfile -Append

function zipit {
	param ([string]$source)
	[string] $cmd = "c:\local\bin\7za.exe"
	[string] $arg1 = "a" #add (to archive) mode
	[string] $arg2 = join-path -Path $Env:TEMP -ChildPath $($($source | `
		Split-Path -Leaf) + ".7z") # filespec for archive
	[string] $arg3 = $source #spec for source directory
	[string] $arg4 = "-mx=1" #compression level... minimal for performance
	#[string] $arg4 = "-mtm=on" #timestamp preservation - commented out for perf.
	#[string] $arg5 = "-mtc=on"
	#[string] $arg6 = "-mta=on"
	#invoke command, route output to null for performance.
	& $cmd $arg1,$arg2,$arg3,$arg4 > $null 

function unzipit {
	param ([string]$dest)
	[string] $cmd = "c:\local\bin\7za.exe"
	[string] $arg1 = "x" #extract archive mode
	[string] $arg2 = join-path -Path $Env:TEMP -ChildPath $($($dest | `
		Split-Path -Leaf) + ".7z")
	[string] $arg3 = "-aoa" #overwrite existing files
	#destination directory specification:
	[string] $arg4 = '-o"' + $(split-path -Parent $dest) + '"' 
	#invoke command, route to null for performance:
	& $cmd $arg1,$arg2,$arg3,$arg4 > $null 
	Remove-Item $arg2 -Force # delete archive

[String[]] $zips = "V3.3","V3.3.1","PROD\WinXp\Text"
[string] $sourceD = "\\files\r25"
[string] $destD = "s:\r25"

foreach ($zip in $zips) {
	Get-Date | Out-File $logfile -Append 
	[string] "Compressing directory: " + $zip | Out-File $logfile -Append 
	zipIt $(join-path -Path $sourceD -ChildPath $zip)
	Get-Date | Out-File $logfile -Append 
	[string] "Uncompressing to:" + $destD | Out-File $logfile -Append
	unzipit $(Join-Path -Path $destD -ChildPath $zip)

Get-Date | Out-File $logfile -Append 
[string] "Syncing remaining files using Robocopy..." | Out-File $logfile -Append
$xd1 = "\\files\r25\V3.3" 
$xd2 = "\\files\r25\V3.3.1" 
$xd3 = "\\files\r25\PROD\WinXP\text"
$xd4 = "\\files\r25\~snapshot"
$roboArgs = @("/e","/copy:datso","/purge","/nfl","/ndl","/np","/r:0","/mt:4",`

& robocopy.exe $roboArgs

Get-Date | Out-File $logfile -Append 
[string] "Done with Robocopy..." | Out-File $logfile -Append

# Complete logging:
[datetime] $endTime = Get-Date
[string] "End Time: " + $endTime | Out-File $logfile -Append 
$elapsedTime = $endTime - $startTime
[string] $outstr =  "Elapsed Time: " + [math]::floor($elapsedTime.TotalHours)`
	+ " hours, " + $elapsedTime.minutes + " minutes, " + $elapsedTime.seconds`
	+ " seconds."
$outstr | out-file -Append $logfile