So, this means they either have a local copy on disk of whatever database they’re querying, or they’re dumping a remote db to disk at some point before/during/after their query, right?
Either way, I have just one question - why?
Edit: found a more in-depth explanation elsewhere in the thread: https://xcancel.com/DataRepublican/status/1900593377370087648#m
So yeah, she’s apparently toting around an external hard drive with a copy of the “multiple terabytes” large US spending database, running queries against it, then dumping the 60k-row result set to CSV for further processing.
I’m still confused about at what point the external drive overheats, even if she is doing all this in a “hot humid” hotel room where she can’t run any fans, I guess because her kids were asleep?
But like, all of that just adds more questions, and doesn’t really answer the first one - why?
Even if it was local, a Raspberry Pi can handle a query that size.
Edit - honestly, it reeks of a knowledge level that calls the entire PC a “hard drive”.
Unless they actually mean the hard drive, and not the computer. I’ve definitely had a cheap enclosure overheat and drop out on me before when making the drive seek a lot, although it was more likely the enclosure’s own electronics overheating. Unless their query was rubbish, a simple database scan/search like that should be fast and not demanding in the slightest. Doubly so if it’s a dedicated database server rather than some embedded thing like SQLite. A few tens of thousands of rows should be basically nothing.
Yeah, no matter which way you disorganize 60,000 rows, the data is still going to be read into memory once.
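For a sense of scale, here's a rough sketch of the "query, then dump 60k rows to CSV" step using only Python's standard library. To be clear, this is not her actual setup: the `awards` table, its columns, and the synthetic rows are all made up for illustration, and it runs against an in-memory SQLite database rather than a multi-terabyte one on an external drive. It only shows that the export side of the job is a single pass over the result set.

```python
import csv
import io
import sqlite3


def export_rows(n_rows=60_000):
    """Build a throwaway table, query it, and return the result as CSV text."""
    # Hypothetical schema and synthetic data, purely for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE awards (id INTEGER PRIMARY KEY, recipient TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO awards (recipient, amount) VALUES (?, ?)",
        ((f"recipient-{i}", i * 1.5) for i in range(n_rows)),
    )

    buf = io.StringIO()
    writer = csv.writer(buf)
    cursor = conn.execute("SELECT id, recipient, amount FROM awards ORDER BY id")
    # Header row from the cursor metadata, then one pass over the result set.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
    conn.close()
    return buf.getvalue()


csv_text = export_rows()
print(len(csv_text.splitlines()), "lines written")
```

The expensive part of the real workload would be scanning a multi-terabyte database to find those rows, not writing them out, which is why the overheating story is hard to pin on the export itself.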