Photoshop scratch, system swap, and solid state

The first rule of Photoshop club is: don’t let Photoshop use the system disk as a scratch disk. It’s a good rule of thumb and it has helped many people get much better performance from their Photoshop systems. It’s helped even more people think they were getting better performance from their Photoshop systems.

The principle is straightforward: Photoshop is prone to using more memory than the system on which it runs has. When modern operating systems run out of memory, they begin using hard disk space as though it were memory. Hard disks are several hundred thousand times slower than RAM when it comes to moving information, so this can take a huge performance toll on your computer. If you have ever opened too many applications and been met with very poor responsiveness on your computer, as well as a lot of extra noise, that’s the hard disk working overtime pretending to be memory. And it’s a good thing; in the old days when computers ran out of memory, they’d just stop working.

The oper­at­ing sys­tem is not the only one play­ing this game though. The authors of Pho­to­shop, know­ing that Pho­to­shop can devour mem­o­ry, have built rou­tines that do very much the same thing. Rather than ask­ing the oper­at­ing sys­tem for more mem­o­ry, Pho­to­shop starts cre­at­ing its own files on disk to hold the infor­ma­tion that it would rather have in RAM. It does this in a way that is opti­mized for its own use. Since Pho­to­shop knows which infor­ma­tion is most crit­i­cal to keep in fast mem­o­ry, it can cher­ryp­ick the best data to spool off to disk.

###Swap ver­sus scratch

Typ­i­cal­ly, an oper­at­ing sys­tem’s vir­tu­al mem­o­ry disk spool­ing is referred to as swap (think of it as swap­ping some disk space for an equal amount of mem­o­ry) while Pho­to­shop refers to this very sim­i­lar process as scratch. You can think of it like writ­ing down a phone num­ber on a piece of scratch paper so that you can for­get it and look back at the paper later.

Unlike the oper­at­ing sys­tem’s swap space, which is gen­er­al­ly not user-con­fig­urable, Pho­to­shop pro­vides user con­trol over how it han­dles scratch. One can assign the dri­ve to be used for Pho­to­shop scratch, and even assign mul­ti­ple dri­ves and the order in which they ought to be used.

A natural rule one might therefore apply to the assignment of scratch drives is to assign the drives in order of how fast they are. That can get complicated because there are a number of variables that go into the speed at which one can write or access data from a disk, but basically you want to use up the space on the fast drive before starting to write to the slower disks.

###Can’t you see I’m busy read­ing and writing?

That’s not the end of it, how­ev­er. Hard disks can’t real­ly do two things at once, but they will try if you ask them to. Some­times the infor­ma­tion two process­es are look­ing for will be on very dif­fer­ent parts of the dri­ve, caus­ing the read and write heads to trav­el back and forth quite inef­fi­cient­ly, read­ing part of one file before mov­ing back to write part of anoth­er file and then back to the first. Hard dri­ve seek times are typ­i­cal­ly less than a hun­dredth of a sec­ond, but that can add up if the dri­ve has to go back and forth a lot. When the oper­at­ing sys­tem swaps mem­o­ry to disk or Pho­to­shop writes scratch data to disk, both want to do a lot of con­stant read­ing and writ­ing. When these two process­es are com­pet­ing for the hard disk, it can make an already slow process even slower.
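
To get a feel for how those seeks add up, here is a minimal sketch in Python. The figures are illustrative assumptions rather than measurements: a 10 ms seek (the "hundredth of a second" upper bound mentioned above), a hypothetical 100 MB/s sequential transfer rate, and a hypothetical 1 MB written between interruptions. The function name `transfer_seconds` is made up for this example.

```python
def transfer_seconds(total_mb, throughput_mb_s, seek_s=0.0, chunk_mb=None):
    """Estimate the time to move total_mb of data on one drive.

    If chunk_mb is given, the drive is assumed to be interrupted after every
    chunk and to pay one seek (seek_s) to get back to this stream.
    """
    streaming = total_mb / throughput_mb_s
    if chunk_mb is None:
        return streaming  # uninterrupted sequential transfer
    seeks = total_mb / chunk_mb
    return streaming + seeks * seek_s


# Assumed, illustrative numbers (not taken from any real drive).
alone = transfer_seconds(10_000, 100.0)
contended = transfer_seconds(10_000, 100.0, seek_s=0.010, chunk_mb=1.0)

print(f"uninterrupted: {alone:.0f} s")
print(f"with a seek every 1 MB: {contended:.0f} s")
```

Under these assumptions the seeks alone double the transfer time, and that is for only one of the two competing streams.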

This is why Pho­to­shop users typ­i­cal­ly pre­fer to instruct Pho­to­shop to ignore the sys­tem disk and only use some oth­er disk — any oth­er disk — for scratch. Read­ing sup­port forums for Pho­to­shop, you’ll find this same advice doled out again and again: don’t use your sys­tem disk as a scratch disk.

Sad­ly, this often good advice is giv­en both blind­ly and insis­tent­ly. In peer sup­port forums such as http://forums.adobe.com/community/photoshop you can see users refus­ing to help oth­er users with per­for­mance tun­ing unless they first fol­low the gospel advice: don’t use your sys­tem disk as a scratch disk. It’s unfor­tu­nate, but some­times under­stand­able. When trou­bleshoot­ing, you want to elim­i­nate the obvi­ous caus­es of issues before mov­ing on to the esoteric.

The unfortunate part is that this advice, while good in some circumstances, is by no means universal. Many users are working at resolutions intended for the Web and will never cause their systems to do a meaningful amount of swapping or use of scratch disks. To this objection comes the insistence: no, Photoshop always creates a scratch file. This is true, but if it isn’t done during the time you’re waiting for the computer to finish its task, who cares? Many Photoshop users really don’t need to concern themselves with external drives to operate Photoshop any more than they need external hard drives to operate a word processor.

A poten­tial pit­fall is when the sec­ondary dri­ve is much slow­er than the sys­tem dri­ve. This can be for a vari­ety of rea­sons: hard dri­ve con­trollers inside the com­put­er for the most part are much faster than the exter­nal inter­faces that are avail­able. Thun­der­bolt sup­pos­ed­ly intro­duces some very high through­put rates, but a USB or Firewire dri­ve will be quite lim­it­ed in speed com­pared to the sys­tem dri­ve. There­fore there is a bal­anc­ing act: does the slow­ness of the exter­nal inter­face slow the process down more than the com­pe­ti­tion for the sys­tem dri­ve will? In most cas­es of course the answer is no. Using an exter­nal dri­ve as a scratch disk ought to be good for performance.

###What if the inter­nal dri­ve is very, very fast?

The ques­tion became a big one for me when I began plan­ning to pur­chase the work­sta­tion on which I make the files for my lux­o­graph­ic prints. I knew that I need­ed to get as much mem­o­ry as I could and then get the fastest hard dri­ves I could afford. In truth, I spent more than I could afford, even with the gen­er­ous help of a friend who works at Apple and who gave me her «friends and fam­i­ly» dis­count when I bought the machine. Not only did I get a Mac­in­tosh with as much mem­o­ry as was pos­si­ble in 2008 (32GB), I bought it with a hard­ware RAID card and four 15,000 RPM SAS drives.

Sort­ing out the alpha­bet soup: SAS is an inter­face that han­dles mul­ti­ple simul­ta­ne­ous requests bet­ter than the inter­face most com­put­ers come with. These rel­a­tive­ly expen­sive dri­ves spin at 15,000 rev­o­lu­tions per minute, which is about twice as fast as most desk­top hard dri­ves spin, and near­ly three times as fast as many lap­top hard dri­ves. Rota­tion speed is not the only fac­tor in read­ing and writ­ing data so it does­n’t mean that these dri­ves can move data two or three times faster than stan­dard dri­ves, but they are quite fast owing in part to the speed at which the plat­ters spin.

RAID con­fig­u­ra­tion can get quite com­plex, but the fea­ture I want­ed most was the addi­tion­al speed that can be gained by writ­ing to mul­ti­ple dri­ves at the same time. The elec­tron­ics that con­trol the dri­ves can han­dle more data than can be phys­i­cal­ly writ­ten, so they can write to more than one dri­ve at a time and keep track of where each part is stored. One file will end up stored across mul­ti­ple disks, but it is an effec­tive way to increase the speed.
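
The idea of one file spread across several disks can be sketched in a few lines of Python. This is only a toy model of RAID0-style striping, assuming a simple round-robin layout; the function names `stripe` and `unstripe` and the chunk size are hypothetical, and a real RAID controller does this in hardware with far more bookkeeping.

```python
def stripe(data, n_drives, stripe_size):
    """Split data into fixed-size chunks and deal them out round-robin."""
    drives = [[] for _ in range(n_drives)]
    for offset in range(0, len(data), stripe_size):
        chunk_index = offset // stripe_size
        drives[chunk_index % n_drives].append(data[offset:offset + stripe_size])
    return drives


def unstripe(drives):
    """Reassemble the original data by reading one chunk per drive in turn."""
    chunks = []
    rows = max(len(drive) for drive in drives)
    for row in range(rows):
        for drive in drives:
            if row < len(drive):
                chunks.append(drive[row])
    return b"".join(chunks)


data = bytes(range(10))
drives = stripe(data, n_drives=3, stripe_size=2)
for number, drive in enumerate(drives):
    print(number, drive)
print(unstripe(drives) == data)
```

Because consecutive chunks land on different drives, the controller can keep all of them busy at once, which is where the extra speed comes from. It is also why losing any one drive in such a layout loses the whole file.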

I rea­soned at the time that because all the data would be going through the same con­troller and all the disks would be writ­ten to and read from con­stant­ly that I would not need to set aside a sep­a­rate dri­ve as the con­ven­tion­al wis­dom dic­tat­ed. But I found myself up against the wall when it came time to ask con­fig­u­ra­tion ques­tions of the users that knew more than I did about Pho­to­shop. It did not mat­ter whether I had a RAID, they told me. If I want­ed any­thing oth­er than hor­ri­ble per­for­mance out of Pho­to­shop, I had to set aside a dri­ve oth­er than my sys­tem dri­ve for Pho­to­shop to use as scratch.

At the time I did not have the lux­u­ry of doing a lot of empir­i­cal test­ing, although I did enough test­ing to know that the con­fig­u­ra­tion I set­tled on (one of my four dri­ves for my sys­tem, the oth­er three in a RAID0 con­fig­u­ra­tion ded­i­cat­ed to scratch) was much faster than my pre­vi­ous com­put­er. I spent a lot of mon­ey on that sys­tem, but it let me get as much done in an hour as I’d pre­vi­ous­ly got­ten done in a day. It was a big upgrade.

Since then, I’ve often won­dered how true the blind asser­tion about sec­ond dri­ves is. The price of sol­id state dri­ves, which behave much more like RAM than hard dri­ves, has come down some­what dra­mat­i­cal­ly and the size of the avail­able dri­ves has increased to the point where they could be con­sid­ered a prac­ti­cal alter­na­tive to hard dri­ves with spin­ning plat­ters. What would hap­pen with fast SSD dri­ves, which don’t have read heads that have to trav­el from one side of a plat­ter to anoth­er, when mul­ti­ple sources tried to read and write at the same time?

Recent­ly I had the oppor­tu­ni­ty to do more of this test­ing, and the results are enlight­en­ing. Of course, not every­one does the sort of mem­o­ry-inten­sive Pho­to­shop work that I do, but as mem­o­ry short­ages are the pri­ma­ry cause of Pho­to­shop per­for­mance issues, doing basic trans­for­ma­tions on the enor­mous files I work with is a good test of how a par­tic­u­lar con­fig­u­ra­tion will per­form with regard to the use of scratch and swap.

###The gaunt­let

I tried a vari­ety of tests on my test machines and set­tled on just two, or real­ly just one. I record­ed the time for file open of a 3.5GB two-lay­er grayscale Pho­to­shop .psb (large doc­u­ment for­mat) file, and the time to rotate that file 18.5° clock­wise. I did not include the results of any fur­ther trans­for­ma­tions as the mem­o­ry require­ments exceed­ed the avail­able hard dri­ve space on two of the three test sys­tems when scratch was lim­it­ed to the sys­tem dri­ve. Open­ing and rotat­ing this file gen­er­at­ed a Pho­to­shop scratch file of about 110GB. Fur­ther trans­for­ma­tions on the con­fig­u­ra­tions that could han­dle the addi­tion­al load did­n’t yield any sur­pris­es, so I hope I’m not miss­ing out on too much by com­par­ing the one transformation.

It’s worth not­ing that Activ­i­ty Mon­i­tor was run dur­ing all these tests, and not once under any con­fig­u­ra­tion did CPU usage rise above 10%. This was not sup­posed to be a test of CPU pow­er, but a test of swap and scratch handling.

###The con­fig­u­ra­tions

Three machines were used for test­ing: a 2008 Mac Pro, a 2011 Mac­Book Air, and a 2011 iMac. The Mac Pro has two quad-core Xeon proces­sors run­ning at 2.8GHz, 32GB of 800MHz DDR2 RAM, an Apple RAID Pro card and four 300GB 15,000RPM SAS dri­ves. I changed the RAID con­fig­u­ra­tion around as will be not­ed lat­er, and at one point I reduced the mem­o­ry to 4GB to match the oth­er sys­tems, though only in one of the tests.

The iMac has a sin­gle quad-core i5 proces­sor run­ning at 3.1GHz, 4GB of 1333MHz DDR3 RAM, and a sin­gle 1TB 7200 RPM SATA hard drive.

The Mac­Book Air in some regards ought to be con­sid­ered the low­est-end of the test machines. It has a dual-core i7 adver­tised at 1.8GHz and 4GB of 1333MHz DDR3 RAM. Two things to note about these specs how­ev­er: first that the i7 is a vari­able-speed chip, and report­ed­ly runs as fast as 2.9GHz. Sec­ond, the video mem­o­ry for the Mac­Book Air is shared with the sys­tem mem­o­ry. There­fore it should be con­sid­ered to have close to 3.5GB of RAM in com­par­i­son to the iMac’s 4GB where the iMac has a graph­ics card with 1GB of RAM ded­i­cat­ed sole­ly to video. The main dif­fer­ence that takes the Mac­Book Air out of its «low-end» pigeon­hole is the hard dri­ve: it does not have one. Instead the Mac­Book Air has a 256GB sol­id-state dri­ve. As we will see, the sol­id state dri­ve makes a world of dif­fer­ence when it comes to performance.

| System | System/swap | PS Scratch | Open 3.5GB file | Rotate 18.5° |
|---|---|---|---|---|
| 2008 Mac Pro (32GB) | Single JBOD drive | 3-drive RAID5 | 1:37 | 8:00 |
| 2008 Mac Pro (32GB) | Single JBOD drive | 3-drive RAID0 | 1:29 | 6:33 |
| 2008 Mac Pro (32GB) | Single JBOD drive | 2-drive RAID0 | 1:31 | 6:49 |
| 2008 Mac Pro (32GB) | Single JBOD drive | Second JBOD drive | 1:27 | 7:22 |
| 2008 Mac Pro (32GB) | 4-drive RAID5 | same as system | 1:33 | 7:33 |
| 2008 Mac Pro (32GB) | 4-drive RAID5 | 500GB Firewire 800 | 2:25 | 11:08 |
| 2008 Mac Pro (4GB) | 4-drive RAID5 | same as system | 5:22 | 14:41 |
| 2011 iMac | 7200 RPM SATA | same as system | 4:36 | 41:00 |
| 2011 iMac | 7200 RPM SATA | 500GB Firewire 800 | 5:06 | 57:05 |
| 2011 MacBook Air | 256GB Apple SSD | same as system | 1:45 | 10:04 |
| 2011 MacBook Air | 256GB Apple SSD | 500GB USB2 drive | 6:24 | 39:53 |

A few caveats: these tests were per­formed with­out any sci­en­tif­ic rig­or. These are all machines used for dai­ly pro­duc­tion, not spe­cial­ly-made test­beds. I did not attempt every pos­si­ble con­fig­u­ra­tion. For exam­ple, a 2‑drive RAID0 for sys­tem and a sec­ond 2‑drive RAID 0 for scratch, or a sin­gle 4‑drive RAID0 on the Mac Pro might be a good con­fig­u­ra­tion for per­for­mance, but the increased chance of sys­tem fail­ure due to dri­ve fail­ure is too great a risk for my pro­duc­tion machine. The sys­tem con­fig­u­ra­tions are dif­fer­ent enough that only the broad­est gen­er­al­iza­tions should be drawn.

Also, of these sys­tems at the time the tests were run, only the iMac had Pho­to­shop 12 (CS 5.1). The rest ran Pho­to­shop 11 (CS4). CS5 has since been installed on the Mac Pro and it has shown itself to be slight­ly faster with large files on the Mac Pro due to 64-bit mem­o­ry address­ing, but that gain is (dis­ap­point­ing­ly) less than 5%. It does not seem to be a big dif­fer­ence, but don’t draw too many con­clu­sions based on a com­par­i­son of any two sys­tems by themselves.

That said, some conclusions seem clear. First, using a second drive for Photoshop scratch only helps performance when the second drive is as fast as or faster than the system drive. Unless your system drive is a slow 5400RPM drive, an external Firewire or USB drive will actually hurt performance. Fibre Channel or Thunderbolt drives should be able to help. A second internal drive, if you have one or can install one, can also help. But just plugging in a Firewire or USB2 drive and setting it as your first Photoshop scratch disk is likely to slow you down more than competition for your primary drive will.

###Mem­o­ry is king

Check out the difference in speed on the Mac Pro configured with all four drives in RAID5 between 32GB of RAM and 4GB of RAM. The rotation is nearly twice as fast with 32GB of RAM (7:33 versus 14:41), and the file opens about three and a half times faster (1:33 versus 5:22). Juggling the drive configurations made significant differences in performance, but boosting the memory made the biggest difference.

###SSD is almost king

The MacBook Air without any external drives ran circles around the iMac. Now, the i7 chip in the MacBook Air has better multithreading than the i5 chip in the iMac, but with a lower maximum clock speed and half the cores, the best case scenario for the MacBook Air is that its processor could be almost as fast as the iMac’s. Between that and the lower available RAM, the only explanation for the MacBook Air finishing the rotation in a quarter of the time (10:04 versus 41:00) is the faster solid-state drive. Note that when the Mac Pro was reduced to 4GB of RAM, it actually took almost 50% longer than the MacBook Air (14:41 versus 10:04).

Granted, the Mac Pro was not tested in as many RAID configurations at 4GB as it was at 32GB. It might have been useful to time the Mac Pro with the dedicated three-drive RAID0 and only 4GB of RAM, but it seems clear that the biggest performance difference there came from the memory, not the RAID configuration, and it is almost certain that the solid state drive outperforms the fastest RAID I could have put together with this hardware.
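
The comparisons in this section and the previous one come straight from the rotation times in the results table. A small sketch of the arithmetic follows; the helper names `seconds` and `ratio` and the dictionary labels are invented for illustration, and only the four timings themselves are taken from the table.

```python
def seconds(mmss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


# Rotation times copied from the results table (scratch on the system drive).
rotate = {
    "Mac Pro 32GB, 4-drive RAID5": "7:33",
    "Mac Pro 4GB, 4-drive RAID5": "14:41",
    "iMac, internal SATA": "41:00",
    "MacBook Air, internal SSD": "10:04",
}


def ratio(slower, faster):
    """How many times longer the slower configuration took."""
    return seconds(rotate[slower]) / seconds(rotate[faster])


memory = ratio("Mac Pro 4GB, 4-drive RAID5", "Mac Pro 32GB, 4-drive RAID5")
ssd = ratio("iMac, internal SATA", "MacBook Air, internal SSD")
pro_vs_air = ratio("Mac Pro 4GB, 4-drive RAID5", "MacBook Air, internal SSD")

print(f"4GB vs 32GB Mac Pro: {memory:.2f}x")
print(f"iMac vs MacBook Air: {ssd:.2f}x")
print(f"4GB Mac Pro vs MacBook Air: {pro_vs_air:.2f}x")
```

Keep in mind these are single runs on production machines, so the ratios are only as good as the timings behind them.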

Sol­id state dri­ves are still pret­ty expen­sive, but then again, so are 15,000 RPM hard dri­ves. Add a few hun­dred dol­lars for a hard­ware RAID con­troller and the mul­ti­ple dri­ves that are need­ed for a RAID, and I no longer see any wis­dom in using spin­ning disks for a work­sta­tion. For serv­er sit­u­a­tions where redun­dan­cy is need­ed as a fail­safe against dri­ve fail­ure, hard­ware RAID still makes sense.

This is a prag­mat­ic ques­tion, so my hope here is not that any­one will use this post to bul­ly any­one over the ques­tion of Pho­to­shop per­for­mance tun­ing and con­fig­u­ra­tion. You will want to test and exper­i­ment your­self to try to get the best per­for­mance. Per­for­mance isn’t the only ques­tion either, even for some­one who works with ginor­mous files. After all this, I end­ed up leav­ing the Mac Pro with all four dri­ves in a RAID5 con­fig­u­ra­tion. It’s not the fastest of the test­ed options, but if I lose a dri­ve I’ll have time to get a replace­ment before los­ing all my data. That’s worth an extra forty-five sec­onds here and there.

4 Replies to “Photoshop scratch, system swap, and solid state”

  1. Not that great…

    Considering that 32GB is more than enough to complete the 3.5GB file rotation solely in memory, the fact that it only gives you twice the performance suggests something is wrong with this test. If PS were actually processing the file only in RAM, that 3.5GB rotate operation would only take a few seconds. Maybe cache levels/history settings would change this much more, but it seems it has a self-defeating method of handling files when it has more than enough memory to do it only in the memory.

    I currently own a notebook with 32GB memory, and making a 20GB RAMdisk for PS cache to trick it into performing 100% of operations in the memory boosts an operation like this significantly. You need to set the cache levels and history fairly low if you’re going to edit a file that large with more than a few global filters, though.

  2. Ah crap, some­how I missed

    Ah crap, somehow I missed where you mentioned the 120GB of scratch Photoshop ate to do the image rotation.

    So, now I wanna know why it takes that much space just to rotate. lol. That’s crazy.

  3. So I cre­at­ed a 35000×35000

    So I created a 35000×35000 document, filled it with non-uniform color noise, and rotated it, and it took 53 seconds and generated no excess scratch data. I can understand that my processor is faster, but I don’t get at all where the massive 120GB scratch use comes from. 53 seconds is only enough time for my scratch HDD to write 3.1GB at top speeds (60MB/sec x 53 seconds).

    For having 32GB of RAM, methinks something’s wrong with your system since it’s generating such a horridly large scratch file.

    1. Big files

      Idiot me will have to find the file that I used to be sure, but I believe your tests are done with a much smaller file than I used. 35,000×35,000 would be 1.225 gigapixels, and even non-uniform noise will compress pretty well losslessly. 3.5GB on disk could be a much bigger file when opened.

      You did not say how much faster your processor is than mine, but it wouldn’t surprise me if that were a factor. 2.8GHz isn’t too bad these days but it’s not top of the line, and new processors at the same clock speed get about twice as much done in the same time. Also, running with eight cores gives me almost no advantage in Photoshop, which is almost entirely single-threaded (necessarily so due to the nature of the transformations it does; Adobe has done a pretty good job of utilizing multiple processors when possible). Furthermore, a more modern system will have DDR3 1333MHz (or if you’re really 1337 up to 19.2GHz quad-channel) RAM instead of my DDR2-800.

      Your sug­ges­tion of cre­at­ing a RAMdisk is an excel­lent one con­sid­er­ing that I’m still run­ning CS4 which is 32-bit and direct­ly access­es only up to 8GB. Though I was sur­prised to see only a minute (my•newt, not minn•nut) speed advan­tage when I ran the tri­al of Pho­to­shop CS5.5 which, in 64-bit mode claimed to see all 32GB of memory.

      OK, writing all that gave me time to reinstall Photoshop (when I reconfigured my RAID apparently I hosed my registration credentials) so now I can look at the file size (assuming I can find the right 3.5GB file).

      The first file I opened is only one layer, 3.54GB on disk, grayscale 59840×59840. That doesn’t sound like there’s (even lossless) compression turned on there. I’m guessing that your 35,000×35,000 was 3-byte color, which would make it about 3.6GB, right? After 18.5° rotation, that becomes 44,297×44,297 or roughly 5.5GB. My grayscale file goes to 75,736×75,736, which is roughly the same. Add the second layer (which if empty would not increase the file size but will nevertheless increase the memory requirement) and Photoshop would need at least 25GB to do what it does, and 21.5GB of that will automatically spool to disk. (Again, that is why your suggestion of creating a RAMdisk is such a good one, at least for CS4 or earlier on the Mac. You Windows people got 64-bit memory addressing with CS4 and I’ll assume that I did something wrong to cripple my CS5.5 trial.) Leaving 19 more levels of undo enabled will add to that dramatically. If my maths are correct that’s another 105GB and it wouldn’t surprise me if Photoshop allocates all of that before it is used.

      Keep in mind that the point of the test was to com­pare the speed of dif­fer­ent dri­ve con­fig­u­ra­tions, not to see how well I could opti­mize Pho­to­shop in all regards. One of the machines was not mine and I did not want to mess with some­one else’s set­tings. The impor­tant thing is that the Pho­to­shop instal­la­tions were con­fig­ured the same, which I can most­ly vouch for.

      A lit­tle lat­er I’ll try these same tests again with my cur­rent con­fig­u­ra­tion on the Mac Pro (3‑drive RAID for sys­tem, 1 JBOD sit­ting idle) since this appears not to be a con­fig­u­ra­tion I ran this test on.
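
The rotation arithmetic in the last reply can be sketched in a few lines of Python. It assumes the rotated canvas is the bounding box of the rotated square, that color pixels take 3 bytes and grayscale pixels 1 byte, that "GB" means 2**30 bytes, and that each undo level holds a full copy of about 5.5GB. That last assumption follows the reasoning in the reply rather than any documented Photoshop behavior, and the helper names are made up for the example.

```python
import math


def rotated_side(side_px, degrees):
    """Side length of the bounding box of a square rotated by `degrees`."""
    theta = math.radians(degrees)
    return math.ceil(side_px * (abs(math.cos(theta)) + abs(math.sin(theta))))


def gib(n_bytes):
    """Convert a byte count to units of 2**30 bytes."""
    return n_bytes / 2**30


color_side = rotated_side(35_000, 18.5)  # assumed 3 bytes per pixel (RGB)
gray_side = rotated_side(59_840, 18.5)   # assumed 1 byte per pixel (grayscale)

color_gib = gib(color_side**2 * 3)
gray_gib = gib(gray_side**2 * 1)

# Assumption: each of the 19 remaining undo levels keeps a full ~5.5GB copy.
undo_gb = 19 * 5.5

print(color_side, round(color_gib, 1))
print(gray_side, round(gray_gib, 1))
print(undo_gb)
```

The side lengths match the 44,297 and 75,736 pixel figures quoted in the reply, and the undo estimate lands close to the 105GB mentioned there.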
