Wouldn't the file read callback have to wait until all the other connections are served?
Well, yes, the callback would have to wait for other events to be processed first. But that is perfectly fine.
Event driven model vs. thread per request
This concern usually stems from the dedicated thread per request architecture found in traditional web servers. That mental model steers you away from how Node.js actually works, which is by interleaving small events.
The dedicated thread per request architecture is the way Apache, Tomcat, IIS and many other web servers or web application containers work. In these servers there is a predetermined number of workers that handle all incoming requests. Each worker takes a single request and serves it completely from start to finish. It is responsible for reading the resources it needs, waiting for the reads to complete, doing any processing and returning a result. New requests wait in a queue until a worker is free to take them.
Node.js works differently. All requests are taken in as they arrive. Instead of multiple workers each serving one request at a time, there is only one thread. The processing of each request gets the whole thread to itself for the time being. The catch is that as soon as some resource has to be waited on, you let go of the thread and allow others to use it. This is another way of saying that Node.js performs I/O in an asynchronous, non-blocking way.
Serving a request is split into small chunks
Processing each request only until some resource has to be waited on splits the processing into small chunks. The code executes until an asynchronous operation needs to be performed. This could be, for example, a file read or a database call. The asynchronous operation is set up and given a callback through which it can pass on its results and continue later, and then the thread is let go for others to use.
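To make the split concrete, here is a minimal sketch of one request handled in two chunks. The handler name `handleRequest`, the `steps` array and the temporary file name are hypothetical and exist only for illustration; the one real API used is `fs.readFile` with a callback.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Setup only: create a small file to read (hypothetical file name).
const file = path.join(os.tmpdir(), 'chunk-demo.txt');
fs.writeFileSync(file, 'hello');

const steps = [];

// Hypothetical request handler, split into two chunks by one async read.
function handleRequest(done) {
  steps.push('chunk 1: request received, starting file read');

  // Set up the asynchronous read and hand over the continuation (callback).
  fs.readFile(file, 'utf8', (err, data) => {
    if (err) return done(err);
    steps.push('chunk 2: file contents arrived: ' + data);
    done(null, steps);
  });

  // readFile returns immediately; once this function returns,
  // the thread is free for other events.
  steps.push('chunk 1: read scheduled, letting go of the thread');
}

handleRequest((err, result) => {
  if (err) throw err;
  console.log(result.join('\n'));
});
```

Both "chunk 1" lines are recorded before the "chunk 2" line, because the callback can only run after the current chunk has finished and the thread has been released.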
While we wait, Node.js lets other requests proceed. Each of them is processed in the same way, until it too needs to wait on something. This continues until the results of the original operation are in and the thread is free for us to continue processing them.
Small chunks from different requests interleave
The end result is that the chunks from different requests interleave. The processing of every request advances a tiny bit at a time. While one request is waiting for something, others can be advanced until they, in turn, have to wait for something.
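The interleaving can be simulated with a small sketch. It is not a real server: `setTimeout` stands in for waiting on I/O, and the names `handleRequest`, `log` and `pending`, as well as the request ids and delays, are made up for illustration.

```javascript
const log = [];
let pending = 0;

// Hypothetical handler: two chunks of work separated by a simulated wait.
function handleRequest(id, waitMs) {
  pending++;
  log.push(`request ${id}: chunk 1 (set up async operation)`);

  // setTimeout stands in for a file read or database call.
  setTimeout(() => {
    log.push(`request ${id}: chunk 2 (process result after ${waitMs} ms)`);
    pending--;
    if (pending === 0) console.log(log.join('\n'));
  }, waitMs);
}

// Three requests arrive back to back.
handleRequest('A', 30);
handleRequest('B', 10);
handleRequest('C', 20);
```

The first chunks of A, B and C are all logged before any second chunk runs. The second chunks then follow as each simulated wait ends, not in the order the requests arrived.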
This is the way a healthy Node.js server works. It keeps pumping life into the application by advancing individual computations.
Going back to the original question: yes, the file read callback waits for other requests to be handled. The code handling each of the 100 new connections is most likely very short and quickly goes into reading something or otherwise waiting on a resource. The cost of this is minimal and it works really well. Processing requests this way gives the likes of Node.js and nginx an edge and allows them to serve a large number of requests.
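The scenario from the question can be approximated in a few lines. This is a rough sketch under stated assumptions, not a benchmark: the 100 connections are simulated with `setImmediate` callbacks rather than real sockets, and the file name and the counters `served` and `servedAtRead` are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Setup only: a small file for the first request to read (hypothetical name).
const file = path.join(os.tmpdir(), 'first-request-demo.txt');
fs.writeFileSync(file, 'payload');

let served = 0;        // short connection handlers completed so far
let servedAtRead = -1; // how many had completed when the file callback ran

// First request: start the file read and release the thread.
fs.readFile(file, 'utf8', (err) => {
  if (err) throw err;
  servedAtRead = served;
  console.log(`file callback ran after ${servedAtRead} of 100 short handlers`);
});

// 100 new "connections", each a very short handler queued as its own event.
for (let i = 0; i < 100; i++) {
  setImmediate(() => {
    served++;
    if (served === 100) console.log('all 100 short handlers done');
  });
}
```

How many short handlers run before the file callback depends on timing and is not guaranteed. Either way the callback only has to wait for handlers that each take a tiny amount of time.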
You can find an example Express.js application, with debug prints, that receives 100 incoming requests while still serving the first one at this GitHub repo.