Data Engineer
A student application for a hypothetical training institution that offers various courses in which students can enroll. See GitHub to browse the code.
Understand and implement the star and snowflake database design schemas. Take processing and performance considerations into account when developing and implementing the schema design.
Develop facts and fact tables, define fact and dimension granularity, and work with conformed and non-conformed dimensions and time dimensions. See GitHub to view documentation of the work.
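As an illustration of the star schema idea above, here is a minimal sketch using Python's built-in sqlite3 module. It is not the project's actual schema: the table names (dim_student, dim_course, dim_date, fact_enrollment), the fee_paid measure, and the sample rows are all hypothetical. The assumed grain of the fact table is one row per student, per course, per enrollment date.

```python
import sqlite3


def build_star_schema(conn: sqlite3.Connection) -> None:
    """Create three dimension tables and one fact table (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE dim_student (
            student_key INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE dim_course (
            course_key INTEGER PRIMARY KEY,
            title      TEXT NOT NULL
        );
        -- Time dimension: one row per calendar day, keyed as YYYYMMDD.
        CREATE TABLE dim_date (
            date_key  INTEGER PRIMARY KEY,
            full_date TEXT NOT NULL,
            year      INTEGER NOT NULL,
            month     INTEGER NOT NULL
        );
        -- Fact table grain: one enrollment of one student in one course on one date.
        CREATE TABLE fact_enrollment (
            student_key INTEGER NOT NULL REFERENCES dim_student(student_key),
            course_key  INTEGER NOT NULL REFERENCES dim_course(course_key),
            date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
            fee_paid    REAL NOT NULL
        );
        """
    )


def load_sample(conn: sqlite3.Connection) -> None:
    """Insert a few made-up rows so the schema can be queried."""
    conn.executemany("INSERT INTO dim_student VALUES (?, ?)",
                     [(1, "Ada"), (2, "Grace")])
    conn.executemany("INSERT INTO dim_course VALUES (?, ?)",
                     [(1, "Kafka Basics"), (2, "MongoDB Basics")])
    conn.execute("INSERT INTO dim_date VALUES (?, ?, ?, ?)",
                 (20240115, "2024-01-15", 2024, 1))
    conn.executemany("INSERT INTO fact_enrollment VALUES (?, ?, ?, ?)",
                     [(1, 1, 20240115, 100.0),
                      (2, 1, 20240115, 100.0),
                      (1, 2, 20240115, 150.0)])
    conn.commit()


def revenue_by_course(conn: sqlite3.Connection) -> list:
    """Typical star-schema query: join the fact table to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT c.title, SUM(f.fee_paid)
        FROM fact_enrollment AS f
        JOIN dim_course AS c ON c.course_key = f.course_key
        GROUP BY c.title
        ORDER BY c.title
        """
    ).fetchall()
```

To try it, open an in-memory database with sqlite3.connect(":memory:"), call build_star_schema and load_sample on it, then call revenue_by_course. A snowflake variant would further normalize a dimension, for example by moving course categories out of dim_course into their own table.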
Understand the Kafka architecture, its components, and some associated use cases. Explore the contents of Kafka's /bin directory, then use the command-line tools to create new topics and to produce and consume messages.
Implement Kafka producer and consumer clients using the Java API, sending messages both synchronously and asynchronously. Establish use cases for Kafka Connect to better understand its pre-built connectors, and learn the methodology behind using Kafka with Spark Streaming.
SQL Server
Learn the MongoDB fundamentals needed to leverage its power for analytics. These include MongoDB's document model, importing data into a MongoDB cluster, and working with the MongoDB CRUD API, along with an introduction to MongoDB's Aggregation Framework. See GitHub to view documentation of the work.
Create a database and a collection to store blog posts while learning the basics of NoSQL document databases, MongoDB commands in the MongoDB shell, and how to inspect, manage, and optimize a database using the MongoDB Compass GUI.
Create a database and collection using PyMongo, populate the collection with documents, and query the collection. See GitHub to view documentation of the work.
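The PyMongo steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the connection URI, the database name "training", the collection name "students", the sample documents, and the helper names are all hypothetical. It assumes the pymongo package is installed and a MongoDB server is reachable when get_collection is called.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical sample documents, used only for illustration.
SAMPLE_STUDENTS: List[Dict[str, Any]] = [
    {"name": "Ada", "course": "Kafka Basics"},
    {"name": "Grace", "course": "Kafka Basics"},
    {"name": "Linus", "course": "MongoDB Basics"},
]


def get_collection(uri: str = "mongodb://localhost:27017",
                   db_name: str = "training",
                   coll_name: str = "students"):
    """Return a collection handle.

    MongoDB creates the database and collection lazily, on the first insert.
    """
    # Imported here so the helpers below can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return client[db_name][coll_name]


def populate(collection, docs: Iterable[Dict[str, Any]]) -> int:
    """Insert the documents and return how many were written."""
    result = collection.insert_many(list(docs))
    return len(result.inserted_ids)


def find_by_course(collection, course: str) -> List[Dict[str, Any]]:
    """Return all documents for one course, leaving out the generated _id field."""
    return list(collection.find({"course": course}, {"_id": 0}))
```

With a local server running, calling populate(get_collection(), SAMPLE_STUDENTS) followed by find_by_course(get_collection(), "Kafka Basics") would insert the sample documents and read back the matching ones. Passing the collection in as a parameter keeps the insert and query helpers independent of how the connection is made.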