“We now have more data than the largest online encyclopedia as a result of leveraging Arbisoft’s expertise. They’re definitely web crawling experts.”
Eric Fitz, Vice President, Engineering and Product Development
Advanced Energy Economy (AEE) is a group of businesses on a mission to nurture a prosperous economy powered by secure, sustainable, clean, and affordable energy. They do this through policy advocacy, analysis, and education, and to do that they need access to data. A lot of data. AEE’s product, PowerSuite, collects vast amounts of public data from Public Utility Commission websites for each U.S. state and then provides it to PowerSuite subscribers as an easy-to-understand aggregate they can use to obtain updated information, track long-term trends, and, most importantly, make better decisions. And this is where Arbisoft comes in.
After establishing their business in North America, Wanderu had their eyes on the European market and reached out to Arbisoft to achieve their goal. To expand into Europe, Wanderu needed European carriers and integrations for their app to increase their coverage. They partnered with Arbisoft and hired a dedicated team to augment their internal software engineering team. As a result of the partnership, we successfully integrated hundreds of carriers running through thousands of locations with the Wanderu platform using Python.
Later, Wanderu expanded the team and added Arbisoft’s iOS and Android developers. Thanks to our mobile engineering team, Wanderu launched an iOS app enabling travelers to search online schedules, make bookings, and calculate fares for trips on the move.
We used Twisted, Treq, and Scrapy to integrate the Wanderu API with carrier APIs, which allowed the app to access listings from a wider range of road and rail carriers. We also wrote custom code for location mapping.
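To give a sense of what location mapping involves, here is a minimal sketch using only the Python standard library. It is not the code written for Wanderu: the station names, the IDs, the `map_location` helper, and the 0.8 similarity cutoff are all hypothetical, chosen only to illustrate matching a carrier's spelling of a stop to one canonical location record.

```python
import difflib
import unicodedata

# Hypothetical canonical locations (normalized name -> internal ID).
# A real platform would load these from its own location table.
CANONICAL_LOCATIONS = {
    "berlin central bus station": "BER-ZOB",
    "paris bercy seine": "PAR-BERCY",
    "london victoria coach station": "LON-VIC",
}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in without_accents.lower())
    return " ".join(cleaned.split())


def map_location(carrier_name: str, cutoff: float = 0.8):
    """Return the canonical ID for a carrier-supplied name, or None if nothing is close enough."""
    key = normalize(carrier_name)
    # Exact match after normalization is the cheap, unambiguous case.
    if key in CANONICAL_LOCATIONS:
        return CANONICAL_LOCATIONS[key]
    # Otherwise fall back to fuzzy matching; the cutoff is an assumed threshold.
    close = difflib.get_close_matches(key, list(CANONICAL_LOCATIONS), n=1, cutoff=cutoff)
    return CANONICAL_LOCATIONS[close[0]] if close else None


if __name__ == "__main__":
    for raw in ["Paris Bercy-Seine", "London Victoria Coach Stn", "Madrid Sur"]:
        print(raw, "->", map_location(raw))
```

Names that fall below the cutoff return `None`, so they can be routed to manual review rather than being silently attached to the wrong stop.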
After success with the integration and iOS teams, Wanderu expanded their work with Arbisoft and hired a data engineering team.
Our work made data fetches far more efficient, reducing crawler runtime from over a week (168+ hours) to less than 4 hours. That is a decrease of roughly 97.6% in runtime, significantly enhancing the crawlers’ ability to capture and present new information to subscribers as quickly as possible.
Despite the massive increase in performance, the cost of data fetches fell to half of that incurred by the conventional method. Shifting the API from Google App Engine to the Django REST Framework also yielded an additional $600 in savings per month, or approximately $7,200 per year.
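For readers who want to check the figures, the short sketch below reproduces the arithmetic. It takes 168 hours and 4 hours as the before and after runtimes (the text gives these as bounds, "168+" and "less than 4", so the true reduction is at least this large) and $600 as the monthly saving; the function name is our own.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


# Runtime bounds quoted in the text, in hours.
runtime_cut = percent_reduction(168, 4)

# Monthly hosting saving quoted in the text, in US dollars.
annual_savings = 600 * 12

if __name__ == "__main__":
    print(f"Runtime reduction: {runtime_cut:.1f}%")
    print(f"Annual savings: ${annual_savings:,}")
```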
Python, Django, Scrapy