Monday, February 17, 2014

Getting Your Email out of the Barracuda Message Archiver

We run a Barracuda Message Archiver 450. I really like the device, but we were looking at alternatives, and I needed a way to test possible solutions with real mail. The question that came up was, "How do we get our email out of the Barracuda?"

Basically, there is no out-of-the-box solution to this; Barracuda does not have a tool.

So, I wrote my own using PowerShell. :)

I have to say that this code could stand to be cleaned up. I had to use some pretty circuitous methods to get it to work correctly.

Prior to running this:
1. Copy all of the files from the SMB share on the Barracuda Message Archiver somewhere else. I mapped this location as my U: drive.
2. Set up a working folder with gobs of space. I mapped this as my V: drive.
3. Install 7-Zip.
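
A minimal sketch of a pre-flight check for the three prerequisites above. The helper name Get-MissingPath is hypothetical, and the paths are simply the ones the script below uses, with 7-Zip assumed to be in its default install location; adjust them to your own layout.

```powershell
# Hypothetical helper: returns the entries of $Paths that do not exist.
function Get-MissingPath {
    param([string[]]$Paths)
    # Keep only the entries for which Test-Path reports $false
    $Paths | Where-Object { -not (Test-Path -LiteralPath $_) }
}

# Assumed locations, taken from the script in this post
$Required = @(
    'U:\1',                            # copy of the Barracuda SMB share
    'V:\Extract',                      # working folder with plenty of space
    'C:\Program Files\7-Zip\7z.exe'    # default 7-Zip install path (assumption)
)

# Wrap in @() so the result is always an array, even with zero or one hit
$Missing = @(Get-MissingPath -Paths $Required)
if ($Missing.Count -gt 0) {
    Write-Warning ("Missing prerequisites: " + ($Missing -join ', '))
}
```

Running this before the main script only warns; it does not create or change anything.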

Basically, you have a bunch of .zip files. You extract everything out of these, and the extracted files have no extensions. What I got from support was that each of these files is either an email itself or a gzipped archive. I used 7-Zip to try to decompress each file; if the process returned an exit code of 2, I knew the file wasn't a valid archive, so I appended the .eml extension to it. If the file was an archive, the files extracted from it got the .eml extension tacked onto the end.
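
As an alternative to probing each file with 7-Zip, the leading bytes of a file can hint at what it is. This is a sketch of a different technique from the one the script uses: gzip streams start with the bytes 0x1F 0x8B and zip files start with "PK" (0x50 0x4B). The function name Get-ArchiveKind is hypothetical, and the check assumes a raw email never happens to start with those bytes.

```powershell
# Hypothetical helper: classify a file by its first two "magic" bytes.
# Pass a full path; .NET file APIs do not follow PowerShell's current location.
function Get-ArchiveKind {
    param([string]$Path)

    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $header = New-Object byte[] 2
        $read = $stream.Read($header, 0, 2)
    }
    finally {
        $stream.Dispose()
    }

    if ($read -lt 2) { return 'other' }                                  # too short to tell
    if ($header[0] -eq 0x1F -and $header[1] -eq 0x8B) { return 'gzip' }  # gzip signature
    if ($header[0] -eq 0x50 -and $header[1] -eq 0x4B) { return 'zip' }   # "PK" signature
    return 'other'                                                        # probably a raw email
}

# Usage sketch (path is illustrative only):
# Get-ArchiveKind -Path 'V:\Extract\WorkingFolder1\somefile'
```

Files reported as 'other' would be the candidates for a plain rename to .eml, while 'gzip' and 'zip' files would still need to be extracted.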

I use a random number to create output folders to hold all of the many eml files. Some zips had upwards of 35,000 emails in them.

I KNOW I could have done a better job commenting this code. I'm almost embarrassed to put it out here, but I really wished someone had given me some direction, so here it is. If you have the need for this script, you can create a copy of your archives and work through the code a chunk at a time to see what's going on, so that you don't put your production archives at risk. Remember that you can open .eml files with Notepad. :)

I will use commented lines within the script for the remainder of this article.
The Script:

#Specify source and working folders, as well as report file variables
$Source = "U:\1"
$WorkingFolder = "V:\Extract"
$ReportFile = "C:\temp\MailArchReport.txt"
$ReportFileSpacer = "`r`n`r`n===========================================================================`r`n`r`n"

#Ask for starting file number and ending file number
[int]$StartingZip = Read-Host "Enter Number of First Zip File to Process"
[int]$EndingZip = Read-Host "Enter Number of Last Zip File to Process"

#Last chance to get out
$LastChanceAnswer = Read-Host "Are you sure you want to continue processing all files between $StartingZip.zip and $EndingZip.zip? (y or n)"
If ($LastChanceAnswer -ne "y"){
Return #Stop the script here; Break is meant for use inside loops and switch statements
} #End If

#Initialize the array to hold all expected zip file names
$ZipFileSet = @()

#Initialize Report File with the starting date and time
Get-Date | Add-Content $ReportFile

#Counter to populate the zip file names array
For ($i = $StartingZip; $i -le $EndingZip; $i++){
$StartipZipStr = "$i.zip"
$ZipFileSet = $ZipFileSet + $StartipZipStr
} #End For

#Add record for zip files processed to the report file
$ReportFileSpacer | Add-Content $ReportFile
"Files Processed:" | Add-Content $ReportFile
$ZipFileSet | %{Add-Content $ReportFile -Value $_}

#Go through the zip file names array and copy the files from the source to the working folder
Foreach ($file in $ZipFileSet){
copy-item "$Source\$file" -destination "$WorkingFolder"
} #End Foreach

#Create the first working folder
$WorkingFolderOneName = "$WorkingFolder\WorkingFolder1"
mkdir $WorkingFolderOneName | out-null

#Unzip all of the zip files in the array
Foreach ($file in $ZipFileSet){
$sourcefile = "$WorkingFolder\$file"
$targetfolder = "$WorkingFolderOneName"
$ZipCommandStringPartOne = 'C:\"Program Files"\7-zip\7z.exe'
$ZipCommandStringPartTwo = "x $sourcefile -o$targetfolder -r"
cmd.exe /C "$ZipCommandStringPartOne $ZipCommandStringPartTwo" | out-null
} #End Foreach

#Get a list of all extracted files whose names don't end in a three-character extension
$WeirdZipFiles = Get-ChildItem $WorkingFolderOneName -recurse | where {! $_.psiscontainer -and $_.fullname -notlike "*.???"} | select fullname, name, directory

#Add record for number of files
$ReportFileSpacer | Add-Content $ReportFile
"New zip files that don't have a zip extension:" | Add-Content $ReportFile
$WeirdZipFiles | measure-object | select count | %{$_.count | out-string} | Add-Content $ReportFile

#Initialize counters
$MovedCount = 0
$MovedRenamedCount = 0

#Create the folder for the emails
$RandomSeedForEMLFolder = Get-Random
$WorkingFolderTwoName = "$WorkingFolder\Emails_$RandomSeedForEMLFolder"
mkdir $WorkingFolderTwoName | out-null

#Try to extract each of those files with 7-Zip
Foreach ($file in $WeirdZipFiles){
$sourcefile = $File.Fullname
$targetfolder = $File.Directory.Fullname
$ZipCommandStringPartOne = 'C:\"Program Files"\7-zip\7z.exe'
$ZipCommandStringPartTwo = "x $sourcefile -o$targetfolder -r"
cmd.exe /C "$ZipCommandStringPartOne $ZipCommandStringPartTwo" | out-null
If ($LastExitCode -eq 2){ #7-Zip exit code 2: the file wasn't an archive, so treat it as a raw email, append .eml, and move it
$RandomSeed = Get-Random
$FileName = $File.Name
$FilePath = $File.directory.fullname
$OldFileFullname = ($FilePath + "\" + $FileName)
$FileNameAddition = "$RandomSeed.eml"
$NewFileName = ($FileName + $FileNameAddition)
$NewFileFullname = ($FilePath + "\" + $NewFileName)
Rename-Item $OldFileFullname -NewName $NewFileName
Move-Item $NewFileFullname $WorkingFolderTwoName
$MovedRenamedCount++
} #End If
If ($LastExitCode -eq 0){ #7-Zip exit code 0: the file was an archive, so rename the extracted file to .eml, move it, and delete the archive
$FileNameSplit = $File.Name.split(".")
$ResultFileName = $FileNameSplit[0]
$FilePath = $File.directory.fullname
$OldFileFullname = ($FilePath + "\" + $ResultFileName)
$RandomSeed = Get-Random
$FileNameAddition = "$RandomSeed.eml"
$NewFileName = ($ResultFileName + $FileNameAddition)
$FilePath = $file.directory.fullname
$NewFileFullname = ($FilePath + "\" + $NewFileName)
Rename-Item $OldFileFullname -NewName $NewFileName
Move-Item $NewFileFullname $WorkingFolderTwoName
Remove-Item $SourceFile
$MovedCount++
} #End If
} #End Foreach

#Report Stuff
"`r`n Files that were renamed, then moved (.eml files): $MovedRenamedCount" | Add-Content $ReportFile
"`r`n Files that were extracted, then moved: $MovedCount" | Add-Content $ReportFile

#Remove working folder one
Remove-Item $WorkingFolderOneName -recurse -force

#Remove the zip files that were processed
Foreach ($file in $ZipFileSet){
$ZipFileSetPath = ($WorkingFolder + "\" + $file)
Remove-Item $ZipFileSetPath -recurse -force
} #End Foreach

#Add an ending timestamp to the report file
Get-Date | Add-Content $ReportFile

#Email the report file
Send-MailMessage `
-To me@contoso.com `
-From administrator@contoso.com `
-SMTPServer mail.contoso.com `
-Subject "Barracuda Zips Processed" `
-Body "See Attached Report" `
-Attachments $ReportFile

Remove-Item $ReportFile -force
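
The script above shells out through cmd.exe and branches on exit codes 0 and 2. A possible cleanup, sketched here under assumptions, is to call 7z.exe directly with PowerShell's call operator (&), which copes with the space in "Program Files", and to translate the exit code using the values listed in the 7-Zip documentation. The function name Get-SevenZipExitMeaning is hypothetical, and the install path is assumed to be the default.

```powershell
# Hypothetical helper: translate a 7-Zip exit code into a short description.
function Get-SevenZipExitMeaning {
    param([int]$ExitCode)
    switch ($ExitCode) {
        0   { 'No error' }
        1   { 'Warning (non-fatal)' }
        2   { 'Fatal error' }
        7   { 'Command line error' }
        8   { 'Not enough memory' }
        255 { 'User stopped the process' }
        default { "Unknown exit code $ExitCode" }
    }
}

# Usage sketch, commented out because it needs 7-Zip and real files.
# $sourcefile and $targetfolder are the variables from the script above.
# $SevenZip = 'C:\Program Files\7-Zip\7z.exe'   # assumed default install path
# & $SevenZip x $sourcefile "-o$targetfolder" | Out-Null
# Get-SevenZipExitMeaning $LastExitCode
```

Writing the meaning of any non-zero code to the report file would make it easier to tell "not an archive" apart from other failures.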

15 comments:

  1. Thanks for the handy information, I've wondered how the mail was stored on my BMA.

  2. Have you tried running this from multiple machines to speed up the process?

    1. No, actually we're switching to a different solution that allows the import of PST files, so I ended up not using this process. I learned a lot from developing it, though, so I wouldn't call it a waste of time!

  3. THANK YOU! This made it so much simpler to export from a Barracuda 150 to our new Piler mail archive server.

  4. I had to extract mail from a Barracuda Message Archiver 650, and after connecting to the bma-smb share, it looks like mine is made up of tons of individual files, each either an eml or a compressed file, spread across a huge subfolder tree. Thanks for providing me a starting point. I've uploaded my script here for others who have my use case:

    BarracudaExportEml.ps1

  5. I thank you for this script so much. But I am getting errors and am curious if you had the same issue
    I have an Archiver 350. I tested on just one of the ZIP archives. It extracted 22,722 emails fine but failed on 1134 archives with this error:

    cmd.exe : ERROR: D:\BMA_Target\WorkingFolder1\1287\b
    \58\0d5dde3ae7608ad67ebed7d07b7d3.0_3842
    At H:\Grange\PowerShell\BMA_EML_Conversion.ps1:76 char:10
    + cmd.exe <<<< /C "$ZipCommandStringPartOne $ZipCommandStringPartTwo" |
    out-null
    + CategoryInfo : NotSpecified: (ERROR: D:
    \BMA_T...7d07b7d3.0_3842:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError

    Can not open the file as archive

    This is just from one of the archives of course.
    Any ideas? I have tried to open these archives manually with 7-Zip, but that still fails.
    Or are these archives considered corrupted?

  6. I'm going to go with bad archive due to the second to last sentence.

  8. OK, I thought as much myself, but that is a lot of emails to lose, so I will talk to Barracuda today just to be sure. Thanks
    Again, great script

  9. I've gotten the same error. I've browsed to the file in question and added .eml as the extension and I was able to open the file. Unfortunately, I've 2669 of these out of one of the Zips.

  10. Interesting script. We are in the same boat. I have a customer that would like to switch to a new archive solution and wants to get their emails from the BMA. Question for you: If we export all of these zip/eml files, does it retain the folder structures for each user so that they can go back in their mailbox/archive on the new solution?

  11. I don't know, but I do not think so. I was just extracting the raw .eml files. I imagine the structural data would be in their (proprietary) databases. In my case, I never gave my users the client or any logins - if they wanted something they called IT.

  12. I redid this script in Python since I'm more familiar with it than PowerShell. There is a benefit in that by using Python's email module I can parse every extracted EML file and check its validity. It would be fairly easy to extend it to extract the headers from every email (including ones which identify the original mailbox and even the folder from where it was archived) and insert the messages into a SQL database table with extra fields for the headers, and then one could search the archive using SQL queries.

  13. So, are you going to share the Python script?
