[personal profile] kickaha
No, you may *not* read a 4.8GB results file into memory, even with VM mapping, on a 32-bit system.
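(The arithmetic is unforgiving: a 32-bit process gets 2^32 bytes = 4GB of address space total, minus whatever the OS and libraries have already claimed, so a 4.8GB mapping can't even theoretically fit. The usual dodge is to map the file one window at a time. A minimal Python sketch of that approach; the window size is an arbitrary pick on my part, not anything from the actual run:

    import mmap
    import os

    # A 32-bit process can never map a 4.8GB file in one piece, but it
    # can map fixed-size windows of it. 256MB is an arbitrary choice;
    # it just has to be a multiple of mmap.ALLOCATIONGRANULARITY.
    WINDOW = 256 * 1024 * 1024

    def iter_windows(path, window=WINDOW):
        """Yield (offset, mmap view) pairs covering the file, one window at a time."""
        size = os.path.getsize(path)
        with open(path, "rb") as f:
            offset = 0
            while offset < size:
                length = min(window, size - offset)
                view = mmap.mmap(f.fileno(), length,
                                 access=mmap.ACCESS_READ, offset=offset)
                try:
                    yield offset, view
                finally:
                    # Unmap before the next window, so the address-space
                    # footprint stays at one window no matter the file size.
                    view.close()
                offset += length

Each window is unmapped before the next is created, so only one window's worth of address space is ever in use.)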

heh

Date: 2005-05-28 07:27 pm (UTC)
From: [identity profile] cspowers.livejournal.com
This past semester, a co-worker and I sponsored a student project at NCSU. Basically, it was just a tool to do some number crunching to look for unusual behavior based on a user-specified tolerance level.

At first the team thought this was too simple a project and not that interesting. But when we pointed out that their data source was typically 20GB or more, and that there was no way they could possibly keep all their intermediate calculations in memory at once, they got kinda big-eyed.

In the end they came through with flying colors thanks to some mentoring from my co-worker T. I think it was the heftiest piece of design they'd ever done.

Re: heh

Date: 2005-05-28 11:00 pm (UTC)
From: [identity profile] kickaha.livejournal.com
*laugh* Yeah, large dataspace manipulation is a pain. Up until this, the largest dataset I'd worked with was ~200MB, and this one popped up unexpectedly in the middle of a bunch of tests.

I'm going to look into how to kick Python into reading the file a bit more intelligently (perhaps breaking it up into pieces of a GB or so, making sure I never split data that has to stay coherent during analysis across a chunk boundary), but it's a great justification for a Dual G5 and 10.4! :D
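Something along these lines is what I'm picturing; just a rough sketch, assuming records are newline-separated, with the chunk size and filename invented for illustration:

    # Rough sketch of chunked reading. Assumes records are separated by
    # newlines; path and chunk size are made up for illustration.
    CHUNK = 512 * 1024 * 1024  # half a GB per read

    def iter_records(path, chunk_size=CHUNK):
        """Yield one complete record at a time; never holds the whole file."""
        leftover = b""
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break  # EOF
                chunk = leftover + chunk
                # Hold back any partial record at the end of the chunk so
                # nothing gets split across a chunk boundary.
                cut = chunk.rfind(b"\n")
                if cut == -1:
                    leftover = chunk  # no separator yet; carry it all forward
                    continue
                leftover = chunk[cut + 1:]
                for record in chunk[:cut].split(b"\n"):
                    yield record
        if leftover:
            yield leftover  # final record with no trailing newline

    # Usage: the analysis loop walks one record at a time.
    # for rec in iter_records("results.dat"):  # hypothetical filename
    #     analyze(rec)

The point of the rfind() dance is that only whole records ever reach the analysis code, while memory use stays bounded by one chunk.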

"No, really, I *need* the big iron to do my research honey, I *swear*!" ;)
