Part of what makes working with big data difficult is the number of services and platforms that need to work together. You need Flume to ingest data, HBase to store it, HDFS to distribute it across multiple nodes, Solr to index it and make it searchable, and many other pieces in between. That is why you have Apache ZooKeeper: it is a coordination service that the other services can rely on to share configuration, find each other, and stay in sync. Having that intermediary makes working with all these services far less complex (though even with ZooKeeper, it’s still very complex).
To read up on ZooKeeper from the source, here is the official documentation: http://zookeeper.apache.org/doc/trunk/