The interview process took about two weeks and on the whole was pretty reasonable for both sides.
There was a short phone screen (~30 minutes).
Then I had two technical exercises to do. For each, I was given a reasonable time window (4 hours) that started when I visited a special link to get the problem description; I could then use whatever tools I wanted to get it done, and emailed in the completed exercise.
The two exercises were not completely trivial, but they were relatively straightforward and involved skills actually relevant to the position, not arcane puzzles or comp-sci theory. (Basically, the first was to write a Postgres schema from scratch: 5 or 6 tables and maybe 30 columns across all the tables. The second was to write a CSV parser, including reasonable error checking/recovery, to load data into the schema from the first problem.)
Finally, there was the main interview (conducted via FaceTime; this is a remote position). That ran for close to 3 hours.
This worked well, since it gave me a feel for what the work would actually be like, and as the candidate I was given every opportunity to put my best foot forward - no whiteboard exercises, no surprises.
Someone has to write that CSV parser though, don't they?
And sometimes you actually need to write things like sort algorithms. I implemented an insertion sort from scratch recently, because it needed to be done in a particular way to take advantage of the capabilities of a specific JIT compiler.
People do actually do this stuff - it's not all academic.
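For the record, the textbook form is only a few lines; the JIT-friendly variant mentioned above would differ in details that aren't given here. A plain Java sketch:

```java
import java.util.Arrays;

public class InsertionSortSketch {
    // Plain insertion sort: shift larger elements right one slot at a time,
    // then drop the key into the gap. O(n^2) worst case, but fast on small
    // or nearly-sorted inputs.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 5, 9]
    }
}
```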
Did you have to do this on a whiteboard or paper as part of the interview process for your job? The point: not being able to do this in an interview, under completely contrived circumstances and with all the anxiety that comes from close scrutiny by a person who has the power to determine your future career, is not a good gauge of your ability to do it when it matters, given all the resources we as developers generally afford ourselves.
My biggest problem with the vast majority of these tests is that they only test two skills, neither of which is all that critical to software development beyond a certain minimum threshold: recollection, and pattern-matching known solutions to familiar problems. My view, after a decade of tenure in software development, is this: if you're asking me to solve problems which I could easily "solve" with a few minutes of searching the web, you probably don't want or need someone like me, who has spent the majority of their career on big projects requiring a multitude of disciplines: being creative, quantifying results, anticipating future use, and systematically testing and releasing with a minimum of risk.
What I find ironic is that I've never been tested on some of the few rote tasks I find most developers struggle with: committing/branching/merging/commenting code, producing post-release documentation, developing robust API functions, etc.
Scenario: The file is a 100GB CSV file. The machine is a VPS with 500MB of RAM. The task: determine whether the value in the first column of the first row is ever repeated in the first column of any subsequent row.
An import solution is unlikely to work. Most import solutions would try to load the rows into memory, which doesn't work here. Maybe if the imported parser can be configured to run a user-supplied callback on individual rows and then discard them...
(P.S. solving the problem with awk still counts as writing a CSV parser --- in awk)
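That callback style is easy to hand-roll, too. Here's a sketch in Java; `forEachRow` is my own name, not any particular library's API, and the naive comma split deliberately ignores quoting:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.function.Consumer;

public class StreamingRows {
    // Hands each row to the callback and then drops it, so memory use is
    // bounded by the longest single line, not the file size.
    // (Naive field splitting: quoted fields are not handled.)
    static void forEachRow(String filename, Consumer<String[]> handler) throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader(filename))) {
            String line;
            while ((line = r.readLine()) != null) {
                handler.accept(line.split(","));
            }
        }
    }
}
```

The caller supplies whatever per-row logic it needs (comparison, counting, insertion into a database) without the parser ever holding more than one row.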
It's not even a library thing. Reading a file this way does its reads in approximately optimal (for the filesystem) block sizes. The CSV parser just rides on top of stdio.
    BufferedReader r = new BufferedReader(new FileReader(filename));
    String line = r.readLine();
    if (line != null) {
        String first = line.split(",")[0];
        while ((line = r.readLine()) != null) {
            if (first.equals(line.split(",")[0])) {
                return true;
            }
        }
        return false;
    }
    throw new RuntimeException();
would work just fine, and very quickly, too. This sort of question simply isn't hard to solve; success depends far more on tiny implementation details than on any deep insight. I very much doubt that any CSV parser would try to load the file all at once; it'd be too much of a performance hit.
In other words, you write your own CSV parser, in this case using Java.
Btw, CSV files can contain values with commas and even newlines inside of them. So if your point was that you don't have to write an _entire_ CSV parser, only a partial one, unfortunately that isn't true. https://en.wikipedia.org/wiki/Comma-separated_values#Example
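For what it's worth, a quote-aware version of the first-column check doesn't need the whole file in memory either. Here's a sketch (names are mine; it assumes RFC 4180-style quoting, where quoted fields may contain commas, newlines, and doubled quotes, and it compares raw field text without trimming):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class FirstColumnCheck {
    // Scans one CSV record from the stream and returns its first field, or
    // null at end of input. Everything after the first field is read and
    // discarded, so memory use stays constant regardless of file size.
    static String nextFirstField(Reader r) throws IOException {
        int c = r.read();
        if (c == -1) return null;                  // no more records
        StringBuilder first = new StringBuilder();
        boolean inQuotes = false;
        boolean capturing = true;                  // still inside the first field?
        while (c != -1) {
            char ch = (char) c;
            if (inQuotes) {
                if (ch == '"') {
                    int peek = r.read();
                    if (peek == '"') {             // "" escapes a literal quote
                        if (capturing) first.append('"');
                        c = r.read();
                        continue;
                    }
                    inQuotes = false;
                    c = peek;                      // reprocess the peeked char
                    continue;
                }
                if (capturing) first.append(ch);   // quoted CR/LF/commas are data
            } else if (ch == '"') {
                inQuotes = true;
            } else if (ch == ',') {
                capturing = false;                 // skip the rest of the record
            } else if (ch == '\n') {
                break;                             // unquoted newline ends the record
            } else if (ch != '\r') {
                if (capturing) first.append(ch);
            }
            c = r.read();
        }
        return first.toString();
    }

    static boolean firstValueRepeats(String filename) throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader(filename))) {
            String target = nextFirstField(r);
            if (target == null) throw new IOException("empty file");
            String field;
            while ((field = nextFirstField(r)) != null) {
                if (target.equals(field)) return true;
            }
            return false;
        }
    }
}
```

Only the current first field is ever held, so this runs in constant memory even on the 100GB/500MB scenario, and it handles the embedded-newline records that break a readLine() loop.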
>I very much doubt that any CSV parser would try to load the file all at once
I agree, but a naive imported solution would likely try to store all the rows at once, possibly in some sort of list or vector or array or whatever. This is what would cause the memory failure, not the file read itself.
Possibly, but unlikely (most CSV libraries are built around iterators). In any case, the problem is stupidly underspecified. For example, what if the 100GB CSV is 100 rows of 1GB each? What if the input is UTF-16 or UTF-32? How do you deal with the 10 Unicode line separators?
It just tests how well the interviewee knows CSV, which is an ill-specified format anyway. It's a fake problem: no sane person would parse 100GB CSVs on a 500MB VPS, and in real life you'd just try the naive solution, see why it didn't work, and iterate.
This is a valid record definition from a CSV file (where CRLF marks a literal line break):
"b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
Your code will fail on this case. CSV parsing is not as simple as it sounds, and given that the format is also not well-defined, it's even worse (is the first line a header or not, for example?).
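Concretely, a readLine()/split(",") approach like the snippet upthread sees that quoted record as two separate physical lines, and neither "first field" it extracts is a real field:

```java
public class SplitMisparse {
    public static void main(String[] args) {
        // readLine() splits the quoted record "b<CRLF>bb","ccc" at the
        // embedded CRLF, so the parser sees two bogus "rows":
        String line1 = "\"b";            // first physical line
        String line2 = "bb\",\"ccc\"";   // second physical line
        System.out.println(line1.split(",")[0]); // prints: "b
        System.out.println(line2.split(",")[0]); // prints: bb"
    }
}
```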
>and then used whatever tools I wanted to get it done, then email in the completed exercise.
This is a much more realistic way of testing someone's skills. Asking someone to write code while they watch is not.
I am personally horrible at coding while someone is looking over my shoulder. I am just too preoccupied with their presence and the fact that they are watching. And, unless a company is still into pair programming (is anyone these days?), it's not a valid test.
Give me a real problem, reasonable time to solve it, and the tools I'd have in the real world. Then, I can show you what I can do in that same real world vs. how well I interview in some contrived format.