Select and then Sort
When you use both select-object and sort-object in a pipeline, what's the proper order? Let's check how fast they execute.
Each speed test consists of two similar commands that differ only in the order of sort and select in the pipeline. Each command is executed 10 times, and the total execution time is measured in milliseconds.
- Updated: 06/11/2008 (see comment below by Lee Holmes) -
Test #1
PS > (measure-Command { 1..10 | foreach { gsv | sort name | select name,status }}).TotalMilliseconds
444.5451
PS > (measure-Command { 1..10 | foreach { gsv | select name,status | sort name}}).TotalMilliseconds
384.7565
Result: Second command is 15% faster.
Test #2
# this command is the third example of select-object command from the help files.
PS > (measure-command { 1..10 | foreach { gps | sort ws | select -last 5 }}).TotalMilliseconds
346.0652
PS > (measure-command { 1..10 | foreach { gps | select -last 5 | sort WS }}).TotalMilliseconds
100.4444
Result: Second command is 3.44 times faster.
# this command is the sixth example of sort-object command from the CTP help files. I changed the extension to ps1.
PS > (measure-Command { 1..10 | foreach { dir *.ps1 | sort @{Expression={$_.LastWriteTime-$_.CreationTime}; Ascending=$false} | select LastWriteTime, CreationTime}}).TotalMilliseconds
640.9769
PS > (measure-Command { 1..10 | foreach { dir *.ps1 | select LastWriteTime, CreationTime | sort @{Expression={$_.LastWriteTime-$_.CreationTime}; Ascending=$false}}}).TotalMilliseconds
592.3512
Result: Second command is 8.2% faster
Test #3
PS > (measure-Command { 1..10 | foreach { dir | sort -unique | select name}}).TotalMilliseconds
6405.653
PS > (measure-Command { 1..10 | foreach { dir | select name | sort -unique}}).TotalMilliseconds
750.8251
Result: Second command is 753% faster!
'Select then sort' executes faster because sort-object has far fewer properties to work on. When you select certain properties from a collection, select-object creates a new object with just the specified properties of the incoming object, which results in a smaller object to process.
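To see the "smaller object" for yourself, here is a minimal sketch that counts the properties on a full process object and on the object select-object emits. The choice of the current PowerShell process ($PID) and of the Name and Id properties is only an assumption made for illustration; any object and property list would do.

```powershell
# Take one full object: the current PowerShell process.
$full = Get-Process -Id $PID

# Select-Object emits a new object that carries only the named properties.
$trimmed = $full | Select-Object -Property Name, Id

# Count the properties visible on each object.
$fullCount    = @($full.PSObject.Properties).Count
$trimmedCount = @($trimmed.PSObject.Properties).Count

"Full object:    $fullCount properties"
"Trimmed object: $trimmedCount properties"   # 2 (Name and Id)
```

The exact count for the full object depends on the PowerShell and .NET versions, so no number is given for it here.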
One thing is for sure: in most cases, select objects before sorting them, and ALWAYS make sure both orders produce the SAME output!
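A quick way to check that two pipeline orders agree is to run both on a small, known input and compare the results. The sketch below uses made-up sample objects (the item names and WS values are hypothetical stand-ins for process objects) and shows that moving select -last in front of sort changes the result.

```powershell
# Hypothetical sample data standing in for process objects.
$items = foreach ($n in 3, 9, 1, 7, 5, 8, 2) {
    New-Object PSObject -Property @{ Name = "item$n"; WS = $n }
}

# Sort first: the last 3 items are the 3 largest WS values.
$sortFirst = $items | Sort-Object WS | Select-Object -Last 3

# Select first: the last 3 items in input order, sorted afterwards.
$selectFirst = $items | Select-Object -Last 3 | Sort-Object WS

$a = ($sortFirst   | ForEach-Object { $_.WS }) -join ','
$b = ($selectFirst | ForEach-Object { $_.WS }) -join ','

"sort then select: $a"          # 7,8,9
"select then sort: $b"          # 2,5,8
"same output: $($a -eq $b)"     # False
```

When the select step only picks properties (as in Test #1) and the sort key is among them, the same kind of comparison should report matching output.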
2 comments:
Be aware that the order of the commands makes a big difference in functionality, though.
Test #1 -- that is a valid performance trick. I would note that it is 15% faster, not 115% (1.15 times) faster.
Test #2 -- This introduces a bug, as you are only getting the working set from 5 random processes, rather than actually getting the 5 processes with the largest working set.
Test #3 -- This has a good chance of introducing a bug on objects where "Name" isn't a valid key to sort by. For example, imagine sorting DateTime objects. They have a built-in comparison function that PowerShell uses (which sorts by Ticks). If you instead sorted by a property (such as the string representation), the two tests would produce radically different results.
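The DateTime case can be reproduced with a short sketch. The three dates and the 'M/d/yyyy' format string are assumptions chosen for illustration; the invariant culture is used only so the string form is the same on every machine.

```powershell
# Three hypothetical dates, built from explicit year, month, and day values.
$dates = @(
    (New-Object DateTime -ArgumentList 2008, 11, 1),
    (New-Object DateTime -ArgumentList 2008, 2, 1),
    (New-Object DateTime -ArgumentList 2008, 10, 1)
)
$invariant = [Globalization.CultureInfo]::InvariantCulture

# Sorting the objects themselves uses DateTime's built-in comparison.
$byObject = $dates | Sort-Object

# Sorting by a string representation compares text, not points in time.
$byString = $dates | Sort-Object { $_.ToString('M/d/yyyy', $invariant) }

$objectOrder = ($byObject | ForEach-Object { $_.Month }) -join ','
$stringOrder = ($byString | ForEach-Object { $_.Month }) -join ','

"sorted as DateTime: months $objectOrder"   # 2,10,11
"sorted as string:   months $stringOrder"   # 10,11,2
```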
Lee,
Thank you for the comprehensive comment. I've updated the post.
Test #2 example slipped under the radar :)
I can safely say that in most cases, selecting objects before sorting them is much faster (provided both orders return the same output).
-Shay