brooklynsetr.blogg.se - Google path finder

Google path finder software

Strava is a community of athletes from all over the world. It lets you experience what it calls social fitness: connecting and competing with each other via mobile and online apps. Riders compete with each other by climbing the segment leaderboards.

With PathFinder we are trying to address complex use cases that are important but not covered by the currently available functionality. For a new user, it is impossible to search for interesting segments by categories such as activity type, distance or weather. The leaderboard shows the time taken to complete a segment, but it does not display weather conditions, especially wind direction, which affects the completion time. If it snowed at some location yesterday, the segment may well not be in ideal condition even though current weather conditions are good. We tried to find a solution to such use cases by integrating freely available Strava and weather data.

Kevin is a consultant by profession but an athlete by passion. Wherever he goes, he makes sure his running shoes are with him. He recently started using Strava to monitor his activities. Because of his travel, he is constantly exploring new places to run. Strava gives a list of segments at a particular place, but it is quite cumbersome to search for a segment with particular criteria.

Since the inception of the project we were clear about the use cases, but to make sure that we were designing a database that can cater to any future analytics requirement, we used the above procedure in our software design. We captured our use cases and revised them multiple times using UML diagrams. With that exercise we came to know our entities and their relationships.

Activity: stands for the activities of an athlete.
Segment: stores the meta information about a particular segment.
Streams: stores the geospatial data of a particular segment.
Leaderboard: stores the leaders by segment.

We could not get personal data of individual athletes, as we mostly worked with freely available information. Unlike relational database design, which emphasizes normalization, we followed the denormalization techniques used for big data stores such as Hadoop/Hive. Weather data organizes all important weather indicators by city. We need near real-time data as well as historical data about weather conditions for a particular segment.

To combine the real-time and batch data requirements into one unit, we used an approach known as the Lambda Architecture, introduced by Nathan Marz. The architecture consists of a batch layer that stores the historical data, a speed layer that processes the near real-time data, and a serving layer that is used to build visuals.

The batch layer can also be called a data lake, as we dump a large amount of Strava data into it every day. Python scripts extract the data daily through the Strava API, which we call using the Stravalib library. Every day the script extracts data for 4 US regions, which we selected because of API limitations, and dumps it as 4 CSV files in Amazon S3 storage. A Linux Bash script loads them into Hive external tables. We built these staging areas after looking at the data quality. The data needs to be transformed before it is sent to managed tables, from where it can be queried for further analytics. We found the following issues during transformations:

Distance came with its unit of measure attached. We fixed these values so that they can be sorted.

A Python script extracts weather data hourly using an API. The data is loaded directly into a PostgreSQL table, from where it can be queried efficiently. Before we extract new data, the existing data is moved to Hive tables, from where it can be analyzed further.
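
The distance cleanup mentioned in the transformation step could look roughly like the sketch below. The post does not show the raw format, so the "number followed by unit" pattern, the unit set, and the helper name `distance_to_metres` are all assumptions made for illustration, not the project's actual code.

```python
import re

# Assumed raw format: a number followed by a unit, e.g. "5.2 km" or "800 m".
# The original post does not show the real format; this is illustrative only.
_DISTANCE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*(km|mi|m)\s*$", re.IGNORECASE)

# Conversion factors to metres (hypothetical unit set).
_TO_METRES = {"m": 1.0, "km": 1000.0, "mi": 1609.344}


def distance_to_metres(raw: str) -> float:
    """Strip the unit from a distance string and return the value in metres."""
    match = _DISTANCE.match(raw)
    if match is None:
        raise ValueError(f"unrecognised distance value: {raw!r}")
    value, unit = match.groups()
    return float(value) * _TO_METRES[unit.lower()]


rows = ["10 km", "800 m", "2.5 km"]

# Sorting the raw strings compares characters, not magnitudes.
print(sorted(rows))
# Sorting on the converted number gives the order by actual distance.
print(sorted(rows, key=distance_to_metres))
```

Storing the converted number in the managed table, rather than the original string, is what makes sorting and range filters on distance behave as expected.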