Wednesday, October 17, 2007

Adelaide Geek Dinner Approaches

I announced earlier that I was planning a dinner get-together for developers in Adelaide. I have since finalised the time and venue and have established an initial guest list.

The dinner will be held on Saturday, November 17th, starting at 6:30pm at Cafe Buongiorno in the city. The cafe is at 187 Rundle Street, near the corner of Pulteney Street and the entrance to U-Park (for convenient parking). They serve familiar Italian meals, with pizza and pasta starting at around $15.

I have found and contacted several local developer-bloggers and asked them to attend. Some have committed and some have only tentatively accepted, but the response has been positive. If you are, or know, a local developer-blogger who hasn't been personally invited, send me an email and I'll add you to the guest list and make sure I book a large enough table at the cafe.

I look forward to seeing you there.

Tuesday, October 16, 2007

PowerShell Resources

Ever since I heard of the concept of PowerShell (or Monad, as it was known then) I have been excited. Now that it has been RTM for some time and I have had an opportunity to work with it in a production environment, I love it even more.

While PowerShell could be summarised as a cross between a *nix shell and the .NET Framework, there is still a lot that is unique to PowerShell. Learning how it works, and finding efficient tools to work with it, is still necessary to make the most of it.

To learn PowerShell I purchased Bruce Payette's book Windows PowerShell In Action. It is a good-size detailed book at over 500 pages and receiving both the soft-cover book and the searchable PDF was excellent value. Pretty much every aspect of PowerShell is described including why certain design decisions were made. My only issue with the book, and it's not a big issue, is that I was itching to write some PowerShell scripts but script files aren't explained until Chapter 8 and security for script files isn't fully explained until Chapter 13 (the last chapter).

When I started to write my scripts I had the PowerShell console open on one monitor and Notepad open on the other. I would try certain commands in the console window and when they worked and gave the results I wanted I would copy them to the script in Notepad, save, and switch back to the console window to test the script. I still pretty much work like this today but I've replaced Notepad.

Considering I spend most of my time in Visual Studio, I really wanted the same IntelliSense and syntax highlighting experience when writing PowerShell scripts. The first PowerShell "IDE" I encountered was PowerShell Analyzer, but I felt overwhelmed by the UI given that all I wanted was to edit .ps1 files. It seems very capable but just didn't feel right. More recently I have tried PowerGUI and it is very close to "Notepad for PowerShell". I have used it to develop the scripts for the last two PowerShell posts and I recommend it.

Do you have any resources you feel have been invaluable for getting the most out of PowerShell?

Sunday, October 14, 2007

Report Services Automation With PowerShell

In late September Paul Stovell wrote about a set of VB.NET scripts he prepared to help deploy reports to SQL Server Reporting Services. If you've ever had the displeasure of deploying SSRS reports without Visual Studio then you'll understand how much it sucks.

Paul went to the effort to write individual scripts for creating folders and data sources on the server, uploading report definitions and configuring permissions. With Paul's work, simple command scripts can then be used to deploy reports.

However, these command scripts still need to be written, and they end up containing much of the same information as can be found in the .rptproj project file and the .rds data source files. I despise the idea of maintaining any sort of configuration information in more than one place, so adding to the deploy command script whenever I add a report to the project in Visual Studio just makes me cringe.

Additionally, as Paul briefly mentions, MSBuild (and therefore Team Build) does not support Report Services projects so, once again, to deploy your reports as part of Continuous Integration you need to have separate tools.

Today I constructed a lengthy PowerShell script that takes a Report Services .rptproj project file and outputs a command script that uses Paul's VB.NET scripts to deploy the reports as per the project settings. The script is too long to publish inline, so you can download it here.

The script accepts three parameters. ProjectFile is the path to the .rptproj file for the reports you want to deploy. If you omit this parameter, the script uses the first report project file it finds in the current directory. The second parameter, ConfigurationName, tells the script which project configuration to use for the target server URL and destination folders. If you omit this parameter, the script uses the first configuration defined in the project. The last parameter, SearchPaths, is a list of paths for the script to search when locating both rs.exe and Paul's .rss files. The SearchPaths parameter is automatically combined with the environment PATH variable and may be omitted.

Here is an example usage:

PS C:\Users\Jason\Dev\MyReports> .\Deploy-SqlReports.ps1 `
    -ProjectFile MyReports.rptproj `
    -ConfigurationName Release `
    -SearchPaths "C:\Tools\Report Services\" `
    | Out-File deploy.cmd -Encoding ASCII;
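
For anyone curious how the parameter defaults described above might be resolved, here is a minimal sketch. It is not taken from the downloadable script: the function name Resolve-DeployInput and the shape of its return value are assumptions made for illustration, and the ConfigurationName default is left out because it depends on the layout of the project file.

```powershell
# Hypothetical helper, not part of Deploy-SqlReports.ps1.
function Resolve-DeployInput {
    param (
        [string]   $ProjectFile,
        [string[]] $SearchPaths = @()
    )

    if (-not $ProjectFile) {
        # Fall back to the first report project in the current directory.
        $first = Get-ChildItem -Path . -Filter '*.rptproj' | Select-Object -First 1
        if (-not $first) { throw 'No .rptproj file found in the current directory.' }
        $ProjectFile = $first.FullName
    }

    # Explicit search paths come first, followed by each non-empty PATH entry.
    $pathEntries = $env:PATH.Split([System.IO.Path]::PathSeparator) `
        | Where-Object { $_ }
    $allPaths = @($SearchPaths) + @($pathEntries)

    New-Object PSObject -Property @{
        ProjectFile = $ProjectFile
        SearchPaths = $allPaths
    }
}
```

Putting the explicit paths ahead of PATH means a copy of rs.exe in a tools folder would be found before one installed elsewhere, which seems the least surprising order.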

As always, my PowerShell skills are slowly improving and this script is not necessarily perfect in either robustness or efficient use of PowerShell. Hopefully it will be as useful to you as it has been to me and any changes you need should be easily made. Please leave a comment with your thoughts and suggestions.

Saturday, October 13, 2007

Find Duplicate Files With PowerShell

I have pieced together a simple PowerShell script to recursively locate all duplicate files (by content, not name) below a chosen directory. It is not the most elegant code but for my purposes it works and hopefully you will be able to tweak it to suit your needs.

Firstly, it filters out any zero-length files. Zero-length files are naturally duplicates of each other and can be found quite trivially without my script. Secondly, it groups all files by their length because files of different lengths can't have the same content. The script then excludes the length-groups with only one entry and calculates the MD5 hash of the remaining files. Groups of files with both matching size and matching hash are then returned in the results.

The hashing function was taken from the Duplicate Files post on the Windows PowerShell team blog. It simply uses the .NET cryptography namespace to compute the hash. From here you could easily exchange the MD5 algorithm for SHA1 or any other preferred algorithm.
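
Swapping the algorithm might look something like the sketch below. The function name Get-ContentHash and its Algorithm parameter are my own hypothetical names, not part of the team blog's function, and the try/finally block assumes PowerShell 2.0 or later (on version 1 you would keep the trap approach).

```powershell
# Hypothetical variant of the hashing helper with a selectable algorithm.
function Get-ContentHash {
    param (
        [System.IO.FileInfo] $File,
        [string] $Algorithm = 'MD5'
    )

    # Map the requested name onto a .NET hash implementation.
    switch ($Algorithm) {
        'MD5'    { $hasher = [System.Security.Cryptography.MD5]::Create() }
        'SHA1'   { $hasher = [System.Security.Cryptography.SHA1]::Create() }
        'SHA256' { $hasher = [System.Security.Cryptography.SHA256]::Create() }
        default  { throw "Unsupported algorithm: $Algorithm" }
    }

    $stream = $File.OpenRead()
    try {
        $bytes = $hasher.ComputeHash($stream)
    }
    finally {
        # Always release the file handle, even if hashing fails.
        $stream.Close()
    }

    # Render the hash bytes as a hex string with no separators.
    [System.BitConverter]::ToString($bytes) -replace '-', ''
}
```

Pass a full path when calling it with a string, because the conversion to FileInfo resolves relative paths against the process working directory rather than the current PowerShell location.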

Because the script must read the entire contents of each potentially matching file to compute its hash, it can take a long time when large files are involved. Running it against deep directory structures with many files will take longer too. The script could easily be modified to take a filtered input of files so that it finds only, for example, duplicate photos.

Here is the script:

param ([string] $Path = (Get-Location))

function Get-MD5([System.IO.FileInfo] $file = $(throw 'Usage: Get-MD5 [System.IO.FileInfo]'))
{
    # This Get-MD5 function sourced from:
    # http://blogs.msdn.com/powershell/archive/2006/04/25/583225.aspx
    $stream = $null;
    $cryptoServiceProvider = [System.Security.Cryptography.MD5CryptoServiceProvider];
    $hashAlgorithm = new-object $cryptoServiceProvider
    $stream = $file.OpenRead();
    $hashByteArray = $hashAlgorithm.ComputeHash($stream);
    $stream.Close();

    ## We have to be sure that we close the file stream if any exceptions are thrown.
    trap
    {
        if ($stream -ne $null) { $stream.Close(); }
        break;
    }

    return [string]$hashByteArray;
}

$fileGroups = Get-ChildItem $Path -Recurse `
    | Where-Object { $_.Length -gt 0 } `
    | Group-Object Length `
    | Where-Object { $_.Count -gt 1 };

foreach ($fileGroup in $fileGroups)
{
    foreach ($file in $fileGroup.Group)
    {
        Add-Member NoteProperty ContentHash (Get-MD5 $file) -InputObject $file;
    }

    $fileGroup.Group `
        | Group-Object ContentHash `
        | Where-Object { $_.Count -gt 1 };
}
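
To limit the search to photos, as suggested earlier, the first stage of the pipeline could filter by extension. This is only a sketch: the function name Get-PhotoSizeGroup and the extension list are hypothetical, and it covers just the size-grouping stage rather than the hashing.

```powershell
# Hypothetical list of photo extensions; adjust to suit.
$photoExtensions = '.jpg', '.jpeg', '.png', '.gif'

function Get-PhotoSizeGroup {
    param ([string] $Path = (Get-Location))

    # -contains is case-insensitive, so .JPG and .jpg both match.
    Get-ChildItem $Path -Recurse `
        | Where-Object { -not $_.PSIsContainer -and $photoExtensions -contains $_.Extension } `
        | Where-Object { $_.Length -gt 0 } `
        | Group-Object Length `
        | Where-Object { $_.Count -gt 1 }
}
```

The groups it returns could then be hashed in the same way as $fileGroups in the script above.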

Once you have the output of the script you could use it to delete the unnecessary files:

$dupes = Get-DuplicateItems;
$dupes | % { ($null, $rest) = $_.Group; $rest; } `
| Remove-Item -WhatIf;
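
The one-liner above relies on multiple assignment: the first member of each group is discarded into $null and everything else lands in $rest. A small sketch of the same idea as a named filter, tried against stand-in data, may make that clearer. Get-RedundantItem is a hypothetical name, and the sample objects only imitate the Group property of real group results.

```powershell
# Hypothetical filter: emit every member of a group except the first.
function Get-RedundantItem {
    process {
        # The first member is kept (assigned to $keep); the rest are emitted.
        $keep, $rest = $_.Group
        $rest
    }
}

# Stand-in objects that mimic the Group property of Group-Object output.
$sampleGroups = @(
    (New-Object PSObject -Property @{ Group = 'a.txt', 'b.txt', 'c.txt' }),
    (New-Object PSObject -Property @{ Group = 'd.txt', 'e.txt' })
)

$redundant = @($sampleGroups | Get-RedundantItem)
```

With real results you would pipe the output to Remove-Item -WhatIf first, exactly as above, and only drop -WhatIf once the list looks right.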

As always, if you have any suggestions or improvements don't hesitate to leave a comment here.

BlogML Contribution

I was hit with trackback spam on my blog some time ago and decided the easiest way to stop it was to disable trackbacks on the site altogether. The unfortunate downside to this is that I'm not automatically notified when someone blogs in response to one of my posts. I have to go looking.

Today I found a post by Doron Yaacoby written in August (sorry I took so long) about the Live Space support I added to the BlogML project. As I originally mentioned when I posted about this feature, I was too lazy to code a GUI for it. Thankfully, Doron has written one for us, and in WPF too!

It has been a while since I actually used my Live Space BlogML code to convert my own blog. In response to Doron's note about comment support: from memory, the API exposed by the Live Spaces website for querying blog data does not expose comments. Maybe they've changed the API since. Maybe there's a clever way to screen-scrape it.

Thanks for contributing, Doron, and thanks for your kind words about my code too.