Sorting
Deployments found: 3
Yelp has established a loyal consumer following, due in large part to the fact that they are vigilant in protecting the user from shill or suspect content. Yelp uses an automated review filter to identify suspicious content and minimize exposure to the consumer. The site also features a wide range of other features that help people discover new businesses (lists, special offers, and events), and communicate with each other. Additionally, business owners and managers are able to set up free accounts to post special offers, upload photos, and message customers. The company has also been focused on developing mobile apps and was recently voted into the iTunes Apps Hall of Fame. Yelp apps are also available for Android, Blackberry, Windows 7, Palm Pre and WAP. Local search advertising makes up the majority of Yelp’s revenue stream. The search ads are colored light orange and clearly labeled “Sponsored Results.” Paying advertisers are not allowed to change or re-order their reviews.
Why Amazon Web Services Yelp originally depended upon giant RAIDs to store their logs, along with a single local instance of Hadoop. When Yelp made the move to Amazon Elastic MapReduce (Amazon EMR), they replaced the RAIDs with Amazon Simple Storage Service (Amazon S3) and immediately transferred all Hadoop jobs to Amazon Elastic MapReduce.
“We were running out of hard drive space and capacity on our Hadoop cluster,” says Yelp search and data-mining engineer Dave Marin. Yelp uses Amazon S3 to store daily logs and photos, generating around 1.2TB of logs per day. The company also uses Amazon EMR to power approximately 20 separate batch scripts, most of those processing the logs. Features powered by Amazon Elastic MapReduce include:
- People Who Viewed this Also Viewed
- Review highlights
- Auto complete as you type on search
- Search spelling suggestions
- Top searches
- Ads
Why Amazon Web Services Coinbase evaluated different cloud technology vendors in late 2014, but it was most confident in Amazon Web Services (AWS). In his previous role at NASA’s Jet Propulsion Laboratory, Witoff gained experience running secure and sensitive workloads on AWS. Based on this, Witoff says he “came to trust a properly designed AWS cloud.” The company began designing the new Coinbase Exchange by using AWS Identity and Access Management (IAM), which securely controls access to AWS services. “Cloud computing provides an API for everything, including accidentally destroying the company,” says Witoff . “We think security and identity and access management done correctly can empower our engineers to focus on products within clear and trusted walls, and that’s why we implemented an auditable self-service security foundation with AWS IAM.” The exchange runs inside the Coinbase production environment on AWS, powered by a custom-built transactional data engine alongside Amazon Relational Database Service (Amazon RDS) instances and PostgreSQL databases. Amazon Elastic Compute Cloud (Amazon EC2) instances also power the exchange. The organization provides reliable delivery of its wallet and exchange to global customers by distributing its applications natively across multiple AWS Availability Zones. Coinbase created a streaming data insight pipeline in AWS, with real-time exchange analytics processed by an Amazon Kinesis managed big-data processing service. “All of our operations analytics are piped into Kinesis in real time and then sent to our analytics engine so engineers can search, query, and find trends from the data,” Witoff says. “We also take that data from Kinesis into a separate disaster recovery environment.” Coinbase also integrates the insight pipeline with AWS CloudTrail log files, which are sent to Amazon Simple Storage Service (Amazon S3) buckets, then to the AWS Lambda compute service, and on to Kinesis containers based on Docker images. This gives Coinbase complete, transparent, and indexed audit logs across its entire IT environment. Every day, 1 TB of data—about 1 billion events—flows through that path. “Whenever our security groups or network access controls are modified, we see alerts in real time, so we get full insight into everything happening across the exchange,” says Witoff . For additional big-data insight, Coinbase uses Amazon Elastic MapReduce (Amazon EMR), a web service that uses the Hadoop open-source framework to process data, and Amazon Redshift, a managed petabyte-scale data warehouse. “We use Amazon EMR to crunch our growing databases into structured, actionable Redshift data that tells us how our company is performing and where to steer our ship next,” says Witoff . All of the company’s networks are designed, built, and maintained through AWS CloudFormation templates. “This gives us the luxury of version-controlling our network, and it allows for seamless, exact network duplication for on-demand development and staging environments,” says Witoff . Coinbase also uses Amazon Virtual Private Cloud (Amazon VPC) endpoints to optimize throughput to Amazon S3, and Amazon WorkSpaces to provision cloud-based desktops for global workers. “As we scale our services around the world, we also scale our team. We rely on Amazon WorkSpaces for on-demand access by our contractors to appropriate slices of our network,” Witoff says. Coinbase launched the U.S. Coinbase Exchange on AWS in February 2015, and recently expanded to serve European users.
The Benefits Coinbase is able to securely store its customers’ funds using AWS. “I consider Amazon’s cloud to be our own private cloud, and when we deploy something there, I trust that my staff and administrators are the only people who have access to those assets,” says Witoff . “Also, securely storing bitcoin remains a major focus area for us that has helped us gain the trust of consumers across the world. Rather than spending our resources replicating and securing a new data center with solved challenges, AWS has allowed us to hone in on one of our core competencies: securely storing private keys.” Coinbase has also relied on AWS to quickly grow its customer base. “In three years, our bitcoin wallet base has grown from zero to more than 3 million. We’ve been able to drive that growth by providing a fast, global wallet service, which would not be possible without AWS,” says Witoff . Additionally, the company has better visibility into its business with its insight pipeline. “Using Kinesis for our insight pipeline, we can provide analytical insights to our engineering team without forcing them to jump through complex hoops to traverse our information,” says Witoff . “They can use the pipeline to easily view all the metadata about how the Coinbase Exchange is performing.” And because Kinesis provides a one-to-many analytics delivery method, Coinbase can collect metrics in its primary database as well as through new, experimental data stores. “As a result, we can keep up to speed with the latest, greatest, most exciting tools in the data science and data analytics space without having to take undue risk on unproven technologies,” says Witoff . As a startup company that built its bitcoin exchange in the cloud from day one, Coinbase has more agility than it would have had if it created the exchange internally. “By starting with the cloud at our core, we’ve been able to move fast where others dread,” says Witoff . “Evolving our network topology, scaling across the globe, and deploying new services are never more than a few actions away. This empowers us to spend more time thinking about what we want to do instead of what we’re able to do.” That agility is helping Coinbase meet the demands of fast business growth. “Our exchange is in hyper-growth mode, and we’re in the process of scaling it all across the world,” says Witoff . “For each new country we bring on board, we are able to scale geographically and at the touch of a button launch more machines to support more users.” By using AWS, Coinbase can concentrate even more on innovation. “We trust AWS to manage the lowest layers of our stack, which helps me sleep at night,” says Witoff . “And as we go higher up into that stack—for example, with our insight pipeline—we are able to reach new heights as a business, so we can focus on innovating for the future of finance.”
About Expedia
Expedia, Inc. is a leading online travel company, providing leisure and business travel to customers worldwide. Expedia’s extensive brand portfolio includes Expedia.com, one of the world’s largest full service online travel agency, with sites localized for more than 20 countries; Hotels.com, the hotel specialist with sites in more than 60 countries; Hotwire.com, the hotel specialist with sites in more than 60 countries, and other travel brands. The company delivers consumer value in leisure and business travel, drives incremental demand and direct bookings to travel suppliers, and provides advertisers the opportunity to reach a highly valuable audience of in-market travel consumers through Expedia Media Solutions. Expedia also powers bookings for some of the world’s leading airlines and hotels, top consumer brands, high traffic websites, and thousands of active affiliates through Expedia Affiliate Network.The Challenge
Expedia is committed to continuous innovation, technology, and platform improvements to create a great experience for its customers. The Expedia Worldwide Engineering (EWE) organization supports all websites under the Expedia brand. Expedia began using Amazon Web Services (AWS) in 2010 to launch Expedia Suggest Service (ESS), a typeahead suggestion service that helps customers enter travel, search, and location information correctly. According to the company’s metrics, an error page is the main reason for site abandonment. Expedia wanted global users to find what they were looking for quickly and without errors. At the time, Expedia operated all its services from data centers in Chandler, AZ. The engineering team realized that they had to run ESS in locations physically close to customers to enable a quick and responsive service with minimal network latency. Why Amazon Web Services Expedia considered on-premises virtualization solutions as well as other cloud providers, but ultimately chose Amazon Web Services (AWS) because it was the only solution with the global infrastructure in place to support Asia Pacific customers.“From an architectural perspective, infrastructure, automation, and proximity to the customer were key factors,” explains Murari Gopalan, Technology Director. “There was no way for us to solve the problem without AWS.”
Launching ESS on AWS
“Using AWS, we were able to build and deliver the ESS service within three months,” says Magesh Chandramouli, Principal Architect.
ESS uses algorithms based on customer location and aggregated shopping and booking data from past customers to display suggestions when a customer starts typing. For example, if a customer in Seattle entered sea when booking a flight, the service would display Seattle, SeaTac, and other relevant destinations. Expedia launched ESS instances initially in the Asia Pacific (Singapore) Region and then quickly replicated the service in the US West (Northern California) and EU (Ireland) Regions. Expedia engineers initially used Apache Lucene and other open source tools to build the service, but eventually developed powerful tools in-house to store indexes and queries. By deploying ESS on AWS, Expedia was able to improve service to customers in the Asia Pacific region as well as Europe.“Latency was our biggest issue,” says Chandramouli. “Using AWS, we decreased average network latency from 700 milliseconds to less than 50 milliseconds.”
Running Critical Applications on AWS
By 2011, Expedia was running several critical, high-volumes applications on AWS, such as the Global Deals Engine (GDE). GDE delivers deals to its online partners and allows them to create custom websites and applications using Expedia APIs and product inventory tools. Expedia provisions Hadoop clusters using Amazon Elastic Map Reduce (Amazon EMR) to analyze and process streams of data coming from Expedia’s global network of websites, primarily clickstream, user interaction, and supply data, which is stored on Amazon Simple Storage Service (Amazon S3). Expedia processes approximately 240 requests per second. “The advantage of AWS is that we can use Auto Scaling to match load demand instead of having to maintain capacity for peak load in traditional datacenters,” comments Gopalan. Expedia uses AWS CloudFormation with Chef to deploy its entire front and backend stack into its Amazon Virtual Private Cloud (Amazon VPC) environment. Expedia uses a multi-region, multi-availability zone architecture with a proprietary DNS service to add resiliency to the applications. Figure 2 demonstrates the architecture of the GDE service on AWS. Expedia can add a new cluster to manage GDE and other high volume applications without worrying about the infrastructure.“If we had to host the same applications on our on-premises data center, we wouldn’t have the same level of CPU efficiency,” says Chandramouli. “If an application processes 3,000 requests per second, we would have to configure our physical servers to run at about 30 percent capacity to avoid boxes running hot. On AWS, we can push CPU consumption close to 70 percent because we can always scale out. Fundamentally, running in AWS enables a 230 percent CPU consumption efficiency in data processing. We run our critical applications on AWS because we can scale and use the infrastructure efficiently.”
Using IAM to Manage Security
To simplify the management of GDE, Expedia developed an identity federation broker that uses AWS Identity and Access Management (AWS IAM) and the AWS Security Token Service (AWS STS). The federation broker allows systems administrators and developers to use their existing Windows Active Directory (AD) accounts to single sign-on (SSO) to the AWS Management Console. In doing so, Expedia eliminates the need to create IAM users and maintain multiple environments where user identities are stored. Federation broker users sign into their Windows machines with their existing Active Directory credentials, browse to the federation broker, and transparently log into the AWS Management Console. This allows Expedia to enforce password and permissions management within their existing directory and to enforce group policies and other governance rules. Additionally, if an employee ever leaves the company or takes a different role, Expedia simply make changes to Active Directory to revoke or changes AWS permissions for the user instead of inside of AWS.Standardizing Application Deployment
The success of the ESS and GDE services sparked interest from other Expedia development teams, who began to use AWS for regional initiatives. By 2012, Expedia was hosting applications in the US East (Northern Virginia), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), and US West (Northern California) Regions. Expedia Worldwide Engineering culled best practices from these initiatives to create a standardized deployment setup across all Regions. As Jun-Dai Bates-Kobashigawa, Principal Software Engineer explains,“We’re using Chef to automate the configuration of the Amazon Elastic Compute Cloud (Amazon EC2) servers. We can take any AWS image and use scripts stored in Chef to build a machine and spin up an instance customized for a team in just in a few minutes.”
The team consolidated all AWS accounts under one AWS account and provisioned one Amazon VPC network in each Region. This allows each Region to have an isolated infrastructure with a separate firewall, application layer, and database layer. Expedia applies Amazon EC2 Security Group firewall settings to safeguard applications and services. Amazon VPC is completely integrated into Expedia’s lab and production environments.“The Amazon VPC experience for the developer is totally seamless,” says Bates-Kobashigawa. “Developers use the same Active Directory service for authentication and may not even know that some of the servers that they log onto are running on AWS. It feels like a physical infrastructure with its own subnets and multiple layers, and it’s also easy to connect to our on-premises infrastructure using VPN.”
Expedia uses a blue-green deployment approach to create parallel production environments on AWS, enabling continuous deployment and faster time-to-market.“One of our metrics for success is the reduction of time to deploy within our teams,” says Gopalan. “We use this method to launch applications pretty quickly compared to a traditional deployment. Moreover, reducing the cost of a rollback to zero means we can be fearless with deployments.”
The Benefits
Expedia uses AWS to develop applications faster, scale to process large volumes of data, and troubleshoot issues quickly. By using AWS to build a standard deployment model, development teams can quickly create the infrastructure for new initiatives. Critical applications run in multiple Availability Zones in different Regions to ensure data is always available and to enable disaster recovery. Expedia Worldwide Engineering is working on building a monitoring infrastructure in all Regions and moving to a single infrastructure. Generally, teams have more control over development and operations on AWS. When Expedia experienced conversion issues for its Client Logging service, engineers were able to track and identify critical issues within two days. Expedia estimates that it would have taken six weeks to find the script errors if the service ran in a physical environment. Previously, Expedia had to provision servers for a full-load scenario in its data centers.“To deploy an application using our on-site facility, you have to think about the physical infrastructure,” Bates-Kobashigawa explains. “If there are 100 boxes running, you might have to take 20 boxes out to apply new code. Using AWS, we don’t have to take capacity out; we just add new capacity and send traffic to it.”
Chandramouli comments, “When I was developer, you didn’t want to invest in architecture if you didn’t know how the application would turn out. I had to plan upfront and build a proof of concept to present to stakeholders. By using AWS, I’m not bound by throughput limitations or CPU capacity. When I think of AWS, freedom is the first word that comes to mind.”
The ROI4CIO Deployment Catalog is a database of software, hardware, and IT service implementations. Find implementations by vendor, supplier, user, business tasks, problems, status, filter by the presence of ROI and reference.