Thanks & Regards
Balaji Bobby
KRG Technologies, Inc.
661 367 8000 Ext: 251
- Build the Event Hubs (EH) integration with the Service Fabric microservices implementation, streaming the processed files from blobs into EH for downstream processing.
- Anonymized files (~1000 of them, to a size of ~GB) will be given as input.
- The Service Fabric code portion will be provided.
- Build the Spark processing that reads from Event Hubs; an implementation in either Python or Scala would suffice.
- Look at the caching needs; leverage .cache to retain appropriate intermediate results in the Spark executors so that they can be reused across Spark 'Actions'.
- Our team will evaluate a set of data stores as the landing spot after Spark, with Blobs being a required one. We will pick 1 or 2 of the other candidate stores (SQL DW, Azure SQL DB, Cassandra and DocumentDB), and we will have code snippets and/or guidance on latency and throughput with percentiles.
- Leverage APM tools as appropriate.
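
For candidates who want a feel for the latency and throughput reporting mentioned above, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (percentile, summarize), the sample latencies, and the 2-second window are hypothetical and not part of the client's code, and it uses the simple nearest-rank percentile method rather than whatever the team or an APM tool may actually use.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # 1-based rank into the sorted samples
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, window_seconds):
    """Summarize one measurement window (hypothetical helper)."""
    return {
        "throughput_per_s": len(latencies_ms) / window_seconds,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Hypothetical sample: ten write latencies (ms) observed in a 2-second window.
latencies = [12, 15, 11, 240, 18, 14, 13, 16, 19, 17]
report = summarize(latencies, window_seconds=2)
print(report)
```

Percentiles are reported instead of an average because a single slow write (240 ms in the sample) dominates the tail figures while barely moving the median.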
Locals only - Redmond, WA
Client - HCL America Inc / Microsoft