
Anthropic - Introducing New Computer Capabilities with Claude 3.5 Sonnet and Claude 3.5 Haiku

Amna Manzoor | Posted on October 24, 2024

5-6 Min Read Time

Anthropic has introduced two major upgrades to its AI lineup: an upgraded Claude 3.5 Sonnet and the new Claude 3.5 Haiku. Alongside these models, a new computer use capability has launched in public beta. These developments push the boundaries of software development automation, coding, and computer navigation, bringing new possibilities for developers and businesses alike.

Claude 3.5 Sonnet: Enhancing Software Engineering

The upgraded Claude 3.5 Sonnet improves significantly on its predecessor, particularly in coding and automation. It excels at agentic coding tasks: its score on SWE-bench Verified rose from 33.4% to 49%, higher than any publicly available model, including OpenAI’s o1-preview. It also scored higher on TAU-bench, which assesses tool-based problem-solving:

Retail domain: From 62.6% to 69.2%
Airline domain: From 36% to 46%

These gains come with no added cost or latency, making Claude 3.5 Sonnet an ideal solution for complex, multi-step development tasks. Companies such as GitLab have reported up to 10% better reasoning on DevSecOps tasks. The Browser Company also found the model to be exceptional for automating web-based workflows.
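To put the benchmark figures quoted above in perspective, the short sketch below computes both the percentage-point gain and the relative improvement for each score. The dictionary layout and the `gains` helper are illustrative names introduced here, not part of any benchmark tooling; only the numbers come from the article.

```python
# Scores quoted in this article, as (previous, upgraded) percentages.
SCORES = {
    "SWE-bench Verified": (33.4, 49.0),
    "TAU-bench retail": (62.6, 69.2),
    "TAU-bench airline": (36.0, 46.0),
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = after - before
    relative = points / before * 100
    return round(points, 1), round(relative, 1)


for name, (before, after) in SCORES.items():
    points, relative = gains(before, after)
    print(f"{name}: +{points} points ({relative}% relative improvement)")
```

The distinction matters when comparing results: a 10-point gain in the airline domain is a larger relative jump than the 6.6-point gain in retail, because the starting score was much lower.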

This model has been tested rigorously in partnership with the US and UK AI Safety Institutes to ensure safe deployment. Its compliance with the ASL-2 Standard, part of Anthropic’s Responsible Scaling Policy, confirms that it meets safety benchmarks required for broader use.


Claude 3.5 Haiku: Affordable, Fast, and Capable AI

The new Claude 3.5 Haiku model is designed for speed and cost-efficiency while matching the performance of Claude 3 Opus, Anthropic’s previous largest model, across many evaluations. It performs especially well on low-latency tasks, making it suitable for real-time applications such as user-facing products and data-intensive workloads.

Claude 3.5 Haiku scores 40.6% on SWE-bench Verified, outperforming earlier Claude models and even GPT-4o in some areas. It provides accurate tool usage and improved instruction-following capabilities, making it effective for generating personalized experiences from large datasets, such as purchase history, pricing records, or inventory data.

This model will be available later in October, via Anthropic’s API, Amazon Bedrock, and Google Cloud Vertex AI. Initially, it will support text-only tasks, with image input functionality expected soon.

AI-Driven Computer Use in Public Beta

One of the most exciting features Anthropic has introduced is Claude’s ability to use computers. With this capability, now in public beta, developers can direct Claude to operate a computer the way a person does: reading the screen, moving the cursor, clicking, and typing. The feature allows the model to automate repetitive processes, conduct open-ended research, and even test software across multiple platforms.
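Conceptually, computer use can be pictured as a loop: capture what is on screen, let the model choose the next action, carry it out, and repeat until the task is finished. The sketch below simulates that loop locally. It is not Anthropic’s API: `FakeDesktop`, `scripted_model`, `Action`, and `run_agent` are hypothetical names, and the "model" is a fixed script standing in for a real model call.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    payload: str = ""  # click target or text to type


class FakeDesktop:
    """Hypothetical stand-in for a real screen; it only records what happens."""

    def __init__(self):
        self.log = []
        self.text = ""

    def screenshot(self):
        # A real system would return an image; a string is enough here.
        return f"text field contains: {self.text!r}"

    def execute(self, action):
        self.log.append(action.kind)
        if action.kind == "type":
            self.text += action.payload


def scripted_model(goal, observation, step):
    """Hypothetical policy replacing the model: click, type the goal, stop."""
    plan = [Action("click", "text field"), Action("type", goal), Action("done")]
    return plan[min(step, len(plan) - 1)]


def run_agent(goal, desktop, choose_action, max_steps=10):
    """Observe, decide, act; return the number of actions executed."""
    for step in range(max_steps):
        observation = desktop.screenshot()
        action = choose_action(goal, observation, step)
        if action.kind == "done":
            return step
        desktop.execute(action)
    return max_steps  # step budget exhausted before the task finished


desktop = FakeDesktop()
steps = run_agent("hello", desktop, scripted_model)
print(desktop.log, desktop.text, steps)
```

The `max_steps` cap is a deliberate design choice in this sketch: an agent that acts on a live screen should have a bounded budget so that a confused run stops rather than looping indefinitely.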

Early adopters like Replit are already using this capability for automating complex UI navigation tasks, helping their Replit Agent product evaluate applications as they are developed.

On OSWorld, a benchmark that evaluates how well AI models use computers, Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, ahead of the next-best AI system’s 7.8%, and 22% when allowed more steps to complete a task. Even so, the feature is still experimental and has limitations: tasks that require scrolling, zooming, or dragging remain difficult for the model to perform smoothly. Developers are advised to start with low-risk projects to explore its potential. Anthropic promises ongoing improvements based on developer feedback.

Ensuring Safe Deployment

To address concerns around security risks, such as spam, fraud, or misinformation, Anthropic has developed new classifiers to monitor and prevent misuse of the computer use feature. This proactive approach helps ensure the responsible deployment of AI-driven automation.

Dataset and Training Details of Claude Models

According to Google Cloud, all Claude models are trained through several techniques:

Unsupervised learning (learning from patterns in raw data)
Reinforcement Learning from Human Feedback (RLHF) (improving with feedback from people)
Constitutional AI (a process involving both supervised learning and reinforcement learning).

Training Infrastructure

Claude 3.5 Sonnet v2 was trained using cloud services provided by Amazon Web Services (AWS) and Google Cloud Platform (GCP). The main frameworks used for development include PyTorch, JAX, and Triton.

Sources of Training Data

Claude models use a mix of data that includes:

Public internet information that was collected up to August 2023, with Claude 3.5 Sonnet v2’s training ending in April 2024.
Non-public data from third parties, which includes content created or labeled by users, companies, or hired service providers.
Internally generated data by Anthropic for refining the model.

Data Cleaning and Filtering

To ensure high-quality data, Anthropic applies methods like deduplication (removing repeated information) and classification to filter out irrelevant or low-quality data.
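The two steps named above can be illustrated with a minimal sketch. It is not Anthropic’s pipeline: deduplication is shown as exact matching on normalized text, and the "classifier" is a toy rule (`looks_low_quality`, a hypothetical helper) standing in for a trained quality model.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def deduplicate(docs):
    """Keep the first occurrence of each document, comparing normalized hashes."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def looks_low_quality(doc, min_words=5):
    """Toy rule standing in for a trained classifier (an assumption)."""
    words = doc.split()
    if len(words) < min_words:
        return True
    # Flag documents made up mostly of repeated words.
    distinct = {word.lower() for word in words}
    return len(distinct) / len(words) < 0.5


def clean(docs):
    """Deduplicate first, then drop documents the filter flags."""
    return [doc for doc in deduplicate(docs) if not looks_low_quality(doc)]


docs = [
    "Claude 3.5 Haiku is designed for speed and cost-efficiency.",
    "claude 3.5 haiku   is designed for speed and cost-efficiency.",
    "click here",
    "buy buy buy buy buy buy now now now now",
]
cleaned = clean(docs)
print(cleaned)
```

Production systems typically add near-duplicate detection as well, since exact hashing only catches copies that are identical after normalization.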

Crawling Practices

When gathering public data from websites, Anthropic follows responsible crawling practices:

robots.txt files and other website signals are respected to ensure compliance with site owners' preferences.
Anthropic does not access password-protected or sign-in pages or bypass CAPTCHAs to collect data.
Their web-crawling system operates transparently, making it easy for site owners to identify visits and communicate their preferences to Anthropic.
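Respecting robots.txt, the first practice listed above, can be sketched with Python’s standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` and `OtherBot` user agents, and the example.com URLs are made up for illustration and do not describe Anthropic’s crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ExampleBot may crawl everything except /private/,
# while every other crawler is disallowed entirely.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text directly, without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, user_agent, url):
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(allowed(parser, "ExampleBot", "https://example.com/blog/post"))
print(allowed(parser, "ExampleBot", "https://example.com/private/data"))
print(allowed(parser, "OtherBot", "https://example.com/blog/post"))
```

A crawler that identifies itself with a stable, documented user agent makes this kind of rule easy for site owners to write, which is the transparency the third practice describes.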

What’s Next

The upgraded Claude 3.5 Sonnet is already available, and Claude 3.5 Haiku will be released later in October. Both models, along with the computer use feature, can be accessed via Anthropic’s API, Amazon Bedrock, and Google Cloud Vertex AI. As these tools evolve, Anthropic invites developers to experiment with them in safe, practical applications and to share feedback.
