fast.ai · Making neural nets uncool again
Benson Nyabuti Mainye and Harold Nguyen; bottom row: Dennis Graham, Sarada Lee, and Cory Spencer

Building Tools for Microbiologists in Kenya

Benson Nyabuti Mainye trained as a microbiologist in his home country of Kenya. He noticed that lab scientists can spend up to 5 hours studying a slide through a microscope to try to identify what cell types are in it, and he wanted a faster alternative. Benson created an immune cell classifier to distinguish various immune cells (eosinophils, basophils, monocytes, and lymphocytes) within an image of a blood smear. This fall he traveled to San Francisco to attend part of the fast.ai course in person at the USF Data Institute (a new session starts next month), where another fast.ai classmate, Charlie Harrington, helped him deploy the immune cell classifier. Since malaria is one of the top 10 causes of death in Kenya, Benson is currently working with fellow Kenyan and fast.ai alum Gerald Muriuki on a classifier to distinguish different types of mosquitoes, to isolate the particular types that carry the Plasmodium species, the parasite which causes malaria.

Dairy Goat Farming

Cory Spencer is a dairy goat farmer on bucolic Vancouver Island and, together with his wife, owns The Happy Goat Cheese Company. When one of his goats came down with mastitis, an udder infection, Cory was unable to detect it until after the goat had suffered permanent damage. Estimates suggest that mastitis costs the dairy industry billions of dollars each year. By combining a special camera that detects heat (temperatures are higher near an infection) with deep learning, Cory developed a tool to identify infections far earlier, at a subclinical level, and for one tenth the cost of existing methods. Next up, Cory is building a 3D model to track specific parts of udders in real time, towards the goal of creating an automatic goat-milking robot, since, as Cory says, "The cow guys already have the fancy robotic tech, but the goat folk are always neglected."

Cory Spencer's goats

State-of-the-Art Results in Cancer Genomics

Alena Harley is working to use genetic information to improve cancer treatment in her role as head of machine learning at Human Longevity Institute. While taking the fast.ai course, she achieved state-of-the-art results for identifying the source of origin of metastasized cancer, which is relevant for treatment. She is currently working on accurately identifying somatic variants (genetic mutations that can contribute to cancer), automating what was previously a slow, manual process.

One of Alena Harley's posts about her work on cancer metastasis

From Accountant to Deep Learning Practitioner Working on Solar Energy

Sarada Lee was a former accountant looking to transition careers when she began a machine learning meetup in her living room in Perth, Australia, as a way to study the topic. That informal group in Sarada's living room has now grown into the Perth Machine Learning Meetup, which has over 1,400 members and hosts 6 events per month. Sarada traveled to San Francisco to take the Practical Deep Learning for Coders and Cutting Edge Deep Learning for Coders courses in person at the USF Data Institute, and shared what she learned when she returned to Perth. Sarada recently won a 5-week-long hackathon on the topics of solar panel identification and installation size prediction from aerial images, using U-nets.
As a result, she and her team have been pre-qualified to supply data science services to a major utility company, which is working on solar panel adoption for an area the size of the UK, with over 1.5 million users. Other applications they are working on include electricity network capacity planning, predicting reverse energy flow (and its safety implications), and monitoring the rapid adoption of solar.

Part of the Perth Machine Learning team with their BitLit Booth at Fringe World; Sarada is 2nd from the left

Sarada and the Perth Machine Learning Meetup are continuing their deep learning outreach efforts. Last month, a team led by Lauren Amos created an interactive creative display at the Fringe World Festival to make deep learning more accessible to the general public. This was a comprehensive team effort, and the display included:

- artistic panels designed using style transfer
- GRU/RNN-generated poems
- BERT applied to generate poems and short books
- speech-to-text and text-to-speech APIs, used to interact with a poetry-generating robot

Festival attendees were able to enjoy the elegant calligraphy of machine-generated poems, read chapters of machine-generated books, and even request that a robot generate poems given a short seed sentence. Over 4,000 poems were generated during the course of the 2-week festival.

Cutting-Edge Medical Research at Age 73

At age 73, Dennis Graham is using deep learning to diagnose Parkinson's disease from magnetoencephalography (MEG) data, as part of a UCH Anschutz Neurology Research center project. Dennis is painfully familiar with Parkinson's, as his wife has been afflicted with it for the last 25 years. MEG has the advantages of being inexpensive, readily available, and non-intrusive, but previous techniques had not been analytically accurate when evaluating MEG data. For two years the team struggled, unable to obtain acceptable results using traditional techniques, until Dennis switched to deep learning, applying techniques and code he learned in the fast.ai course. It turns out that the traditional pre-processing was removing essential data that a neural network classifier could effectively and easily use. With deep learning, Dennis is now achieving much higher accuracy on this problem. Despite his successes, it hasn't all been easy, and Dennis has had to overcome the ageism of the tech industry as he embarked on his second career.

A First-Generation College Student Working in Cybersecurity

Harold Nguyen's parents arrived in the United States as refugees during the Vietnam War. Harold is a first-generation Vietnamese American, and the first in his family to attend college. He loved college so much that he went on to obtain a PhD in particle physics, and now works in cybersecurity. Harold is using deep learning to protect brands from bad actors on social media, as part of his work in digital risk at Proofpoint. Based on work he did with fast.ai, he created a model with high accuracy that was deployed to production at his company last month. Earlier during the course, Harold created an audio model to distinguish between the voices of Ben Affleck, Elon Musk, and Joe Rogan.

What problem will you tackle with deep learning?

Are you facing a problem in your field that could be addressed by deep learning? You don't have to be a math prodigy or have gone to the most prestigious school to become a deep learning practitioner. The only pre-requisite for the fast.ai course (available in person or online) is one year of coding, yet it teaches you the hands-on, practical techniques needed to achieve state-of-the-art results. I am so proud of what fast.ai students and alums are achieving.
As I shared in my TEDx talk, I consider myself an unlikely AI researcher, and my goal is to help as many unlikely people as possible find their way into the field.

Onstage during my talk, "AI needs all of us", at San Francisco TEDx

Further reading: you may be interested to read about some other fantastic projects from fast.ai students and alumni in these posts:

- Artificial Intelligence Education Transforms The Developing World (Forbes)
- Deep Learning: Not Just for Silicon Valley
- fast.ai forums: Share your work here

fastec2 script: Running and monitoring long-running tasks

15 Feb 2019 · Jeremy Howard

This is part 2 of a series on fastec2. For an introduction to fastec2, see part 1.

Spot instances are particularly good for long-running tasks, since you can save a lot of money, and you can use more expensive instance types just for the period you're actually doing heavy computation. fastec2 has some features to make this use case much more convenient. Let's see an example. Here's what we'll be doing:

1. Use an inexpensive on-demand monitoring instance for collecting results, and optionally for launching the task. We'll call this od1 in this guide, but you can call it anything you like.
2. Create a script to do the work required, and put any configuration files it needs in a specific folder. The script will need to be written to save results to a specific folder, so they'll be saved.
3. Test that the script works OK in a fresh instance.
4. Run the script under fastec2, which will cause it to be launched inside a tmux session on a new instance, with the required files copied over, and any results copied back to od1 as they're created.
5. While the script is running, check its progress, either by connecting to the tmux session it's running in, or by looking at the results being copied back to od1 as it runs.
6. When done, the instance will be terminated automatically, and we'll review the results on od1.

Let's look at the details of how this works, and how to use it. Later in this post we'll also see how to use fastec2's volumes and snapshots functionality to make it easier to connect to large datasets.

Setting up your monitoring instance and script

First, create a script that completes the task you need. When running under fastec2, the script will be launched inside a directory called ~/fastec2, and this directory will also contain any extra files (not already in your AMI) needed for the script; it will be monitored for changes, which are copied back to your on-demand instance (od1 in this guide). Here's an example, which we'll call myscript.sh, that we can use for testing:

```bash
#!/usr/bin/env bash
echo starting
```

Your monitoring instance can be a cheap instance type: if you've had your AWS account for less than 1 year, then you can use a t2.micro instance for free. Otherwise, a t3.micro is a good choice; it should cost you around US$7/month (plus storage costs) if you leave it running.

To run your script under fastec2, you need to provide the following information:

- the name of the instance to use (first create it with launch)
- the name of your script
- additional arguments (--myip MYIP, --user USER, --keyfile KEYFILE) to connect to the monitoring instance to copy results to; if no host is provided, it uses the IP of the computer where fe2 is running

E.g., this command will run myscript.sh on spot2, and copy results back to 18.188.162.203:

```
$ fe2 launch spot2 base 80 m5.large --spot
$ fe2 script myscript.sh spot2 18.188.162.203
```

Here's what happens after you run the fe2 script line above:

1. A directory called ~/fastec2/spot2 is created on the monitoring instance, if it doesn't already exist (it is always a subdirectory of ~/fastec2, and is given the same name as the instance you're connecting to, which in this case is spot2).
2. Your script is copied to this directory.
3. This directory is copied to the target instance (in this case, spot2).
4. A file called ~/fastec2/current is created on the target instance, containing the name of this task ("spot2" in this case).
5. lsyncd is run in the background on the target instance, which will continually copy any new/changed files from ~/fastec2/spot2 on the target instance to the monitoring instance.
6. ~/fastec2/spot2/myscript.sh is run inside the tmux session.

If you want the instance to terminate after the script completes, remember to include systemctl poweroff (for Ubuntu, or similar for other distributions) at the end of your script.
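If your long-running task is a Python program rather than a shell script, the same pattern applies: write results into the synced task directory, and power off when done. Here's a minimal sketch, assuming the ~/fastec2/current and ~/fastec2/&lt;taskname&gt; layout described above; the filenames and the placeholder "work" loop are purely illustrative:

```python
#!/usr/bin/env python3
# Minimal fastec2-friendly long-running task (illustrative sketch).
# Anything written under ~/fastec2/<task>/ is picked up by lsyncd and
# copied back to the monitoring instance as it appears.
import json, subprocess, time
from pathlib import Path

task = (Path.home() / 'fastec2' / 'current').read_text().strip()
outdir = Path.home() / 'fastec2' / task
outdir.mkdir(parents=True, exist_ok=True)  # should already exist; just in case

for step in range(100):
    time.sleep(60)  # stand-in for a chunk of real work
    # Checkpoint intermediate results; lsyncd syncs this file to the monitor.
    (outdir / 'progress.json').write_text(json.dumps({'step': step}))

(outdir / 'result.txt').write_text('finished')
# Shut the instance down when done, so we stop paying for it.
subprocess.run(['sudo', 'systemctl', 'poweroff'])
```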
Creating a data volume

One issue with the above process is that if you have a bunch of different large datasets to work with, you either need to copy all of them to each AMI you want to use (which is expensive, and means recreating that AMI every time you add a dataset), or create a new AMI for each dataset (which means, as you change your configuration or add applications, that you have to change all your AMIs). An easier approach is to put your datasets on to a separate volume (that is, an AWS disk). fastec2 makes it easy to create a volume formatted with ext4, which is the most common type of filesystem on Linux. To do so, it's easiest to use the fastec2 REPL (see the last section of part 1 of this series for an introduction to the REPL), since we need an ssh object which can connect to an instance to mount and format our new volume. For instance, to create a volume using instance od1 (assuming it's already running):

```
$ fe2 i
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: inst = e.get_instance('od1')

In [2]: ssh = e.ssh(inst)

In [3]: vol = e.create_volume(ssh, 20)

In [4]: vol
Out[4]: od1 (vol-0bf4a7b9a02d6f942): in-use; 20GB

In [5]: print(ssh.run('ls -l /mnt/fe2_disk'))
total 20
-rw-rw-r-- 1 ubuntu ubuntu     2 Feb 20 14:36 chk
drwx------ 2 ubuntu root   16384 Feb 20 14:36 lost+found
```

As you see, the new disk has been mounted on the requested instance under the directory /mnt/fe2_disk, and the new volume has been given the same name (od1) as the instance it was created with. You can now connect to your instance and copy your datasets to this directory. When you're done, unmount the volume (sudo umount /mnt/fe2_disk in your ssh session), and then you can detach the volume with fastec2. If you don't have your previous REPL session open any more, you'll need to get your volume object first; then you can detach it:

```
In [1]: vol = e.get_volume('od1')

In [2]: vol
Out[2]: od1 (vol-0bf4a7b9a02d6f942): in-use; 20GB

In [3]: e.detach_volume(vol)

In [4]: vol
Out[4]: od1 (vol-0bf4a7b9a02d6f942): available; 20GB
```

In the future, you can re-mount your volume through the REPL:

```
In [5]: e.mount_volume(ssh, vol)
```

Using snapshots

A significant downside of volumes is that you can only attach a volume to one instance at a time. That means you can't use volumes to launch lots of tasks all connected to the same dataset. Instead, for this purpose, you should create a snapshot. A snapshot is a template for a volume: any volumes created from this snapshot will have the same data that the original volume did. Note, however, that snapshots are not updated with any additional information added to volumes; the data originally included in the snapshot remains, without any changes. To create a snapshot from a volume (assuming you already have a volume object vol, as above, and you've detached it from the instance):

```
In [7]: snap = e.create_snapshot(vol, name='snap1')
```

You can now create a volume using this snapshot, which attaches to your instance automatically:

```
In [8]: vol = e.create_volume(ssh, name='vol1', snapshot='snap1')
```
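Because each volume created from a snapshot is independent, you can repeat that last call once per task instance. Here's a sketch, run inside the fe2 REPL and using only the calls shown above; the instance names and snapshot name are placeholders, and the instances are assumed to be already launched:

```python
# Inside the fe2 REPL (fe2 i), where `e` is the pre-created connection object.
# Give each task instance its own copy of the 'snap1' dataset.
for name in ['spot1', 'spot2', 'spot3']:
    inst = e.get_instance(name)
    ssh = e.ssh(inst)
    # Every volume comes from the same snapshot, so each instance sees the
    # same data, mounted under /mnt/fe2_disk.
    e.create_volume(ssh, name=f'{name}-data', snapshot='snap1')
```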
Summary

Now we've got all the pieces of the puzzle. In a future post we'll discuss best practices for running tasks using fastec2 with all these pieces, but here's the quick summary of the process:

1. Launch an instance, and set it up with the software and configuration you'll need.
2. Create a volume for your datasets, if required, and make a snapshot from it.
3. Stop that instance, and create an AMI from it (optionally, you can terminate the instance after that is done).
4. Launch a monitoring instance, using an inexpensive instance type.
5. Launch a spot instance for your long-running task.
6. Create a volume from your snapshot, attached to your spot instance.
7. Run your long-running task on that instance, passing the IP of your monitoring instance.
8. Ensure that your long-running task shuts down the instance when done, to avoid paying for the instance after it completes. You may also want to delete the volume created from the snapshot at that time.

To run additional tasks, you only need to repeat the last four steps. You can automate that process using the API calls shown in this guide.
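For instance, here's one way those last four steps might be scripted, shelling out to the fe2 commands demonstrated above. The instance names, script names, and monitoring IP are placeholders, and the volume-from-snapshot step is left as a comment, since it depends on your snapshot setup:

```python
#!/usr/bin/env python3
# Sketch: repeat the "last four steps" for a batch of tasks, driving the
# fe2 CLI commands shown in this guide (launch, script).
import subprocess

MONITOR_IP = '18.188.162.203'  # placeholder: your monitoring instance (od1)

def fe2(args):
    print('$ fe2', args)
    subprocess.run(f'fe2 {args}', shell=True, check=True)

for i, script in enumerate(['task_a.sh', 'task_b.sh'], start=1):
    name = f'spot{i}'
    fe2(f'launch {name} base 80 m5.large --spot')  # launch a spot instance
    # ...create a volume from your snapshot here, as shown in the REPL above...
    fe2(f'script {script} {name} {MONITOR_IP}')    # run the task; sync results
```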
fastec2: AWS computer management for regular folks

15 Feb 2019 · Jeremy Howard

This is part 1 of a series on fastec2. To learn how to run and monitor long-running tasks with fastec2, check out part 2.

AWS EC2 is a wonderful system: it allows anyone to rent a computer for a few cents an hour, including a fast network connection and plenty of disk space. I'm particularly grateful to AWS, because thanks to their Activate program we've got lots of compute credits to use for our research and development at fast.ai. But if you've spent any time working with AWS EC2, then for setting it up you've probably found yourself stuck between the slow and complex AWS Console GUI, and the verbose and clunky command-line interface (CLI). There are various tools available to streamline AWS management, but they tend towards the power-user end of the spectrum, written for people that are deploying dozens of computers in complex architectures. Where's the tool for regular folks? Folks who just want to launch a computer or two for getting some work done, and shut it down when it's finished? Folks who aren't really that keen to learn a whole bunch of AWS-specific jargon about VPCs and Security Groups and IAM Roles and oh god please just make it go away…

The delights of the AWS Console

Contents:

- Overview
- Installation and configuration
- Creating your initial on-demand instance
- Creating your Amazon Machine Image (AMI)
- Launching and connecting to your instance
- Launching a spot instance
- Using the interactive REPL and ssh API

Since I'm an extremely regular folk myself, I figured I'd better go write that tool. So here it is: fastec2. Is it for you? Here's a summary of what it is designed to make easy ("instance" here simply means "AWS computer"):

- Launch a new on-demand or spot instance
- See what instances are running
- Start an instance
- Connect to a named instance using ssh
- Run a long-running script in a spot instance, and monitor and save results
- Create and use volumes and snapshots, including automatic formatting/mounting
- Change the type of an instance (e.g. add or remove a GPU)
- See pricing for on-demand and spot instance types
- Access through either a standard command line or through a Jupyter Notebook API
- Tab completion; IPython command-line interactive REPL available for further exploration

I expect that this will be most useful to people who are doing data analysis, data collection, and machine learning model training. Note that fastec2 is not designed to make it easy to manage huge fleets of servers, to set up complex network architectures, or to help with deployment of applications. If you're wanting to do that, you might want to check out Terraform or CloudFormation.

To see how it works, let's do a complete walkthru of creating a new Amazon Machine Image (AMI), then launching an instance from this AMI and connecting to it. We'll also see how to launch a spot instance, run a long-running script on it, and collect the results of the script. I'm assuming you already have an AWS account, and know the basics of connecting to instances with ssh. If you're not sure about this bit, first follow this tutorial on DataCamp. Note that much of the coolest functionality in fastec2 is provided by the wonderful Fire, Paramiko, and boto3 libraries, so a big thanks to all the wonderful people that made these available.

Overview

The main use case that we're looking to support with fastec2 is as follows: you want to interactively start and stop machines of various types, each time getting the same programs, data, and configuration automatically. Sometimes you'll create an on-demand image, and start and stop it as required. You may also want to change the instance type occasionally, such as adding a GPU or increasing the RAM; this can be done instantly with a single command. Sometimes you'll fire up a spot instance in order to run a script and save the results, such as for training a machine learning model or completing a web scraping task.

The key to having this work well is to set up an AMI which is configured just as you need it. You may think of an AMI as being something that only sysadmin geniuses at Amazon build for you, but as you'll see, it's actually pretty quick and easy. By making it easy to create and use AMIs, fastec2 lets you easily create the machines you need, when you need them. Everything in fastec2 can also be done through the AWS Console and through the official AWS CLI. Furthermore, there are lots of things that fastec2 can't do; it's not meant to be complete, it's meant to be convenient for the most commonly used functionality. But hopefully you'll discover that, for what it provides, it makes things easier and faster than anything else out there…

Installation and configuration

You'll need python 3.6 or later; we highly recommend installing Anaconda if you're not already using python 3.6. It lets you have as many different python versions as you want, and different environments, and switch between them as needed. To install fastec2:

```
pip install git+https://github.com/fastai/fastec2.git
```

You can also save some time by installing tab completion for your shell; see the README for the setup steps. Once installed, hit Tab at any point to complete a command, or hit Tab again to see possible alternatives.

fastec2 uses a python interface to the AWS CLI to do its work, so you'll need to configure this. The CLI uses region codes, instead of the region names you see in the console. To find out the region code for the region you wish to use, fastec2 can help. To run the fastec2 application, type fe2, along with a command name and any required arguments. The command region will show the first code that matches the (case-sensitive) substring you provide, e.g. (note that I'm using '$' to indicate the lines you type; other lines are the responses):

```
$ fe2 region Ohio
us-east-2
```

Now that you have your region code, you can configure the AWS CLI:

```
$ aws configure
AWS Access Key ID: XXX
AWS Secret Access Key: XXX
Default region name: us-east-2
```

For information on setting this up, including getting your access keys for AWS, see Configuring the AWS CLI.
Creating your initial on-demand instance

Life is much easier when you can rapidly create new instances which are all set up just how you like them, with the right software installed, data files downloaded, and configuration set up. You can do this by creating an AMI, which is simply a "frozen" version of a computer that you've set up, and which you can then recreate as many times as you like, nearly instantly. Therefore, we will first set up an EC2 instance with whatever we're going to need; we'll call this your base instance. You might already have an instance set up, in which case you can skip this step.

One thing that will make things a bit easier is if you ensure you have a key pair on AWS called "default". If you don't, go ahead and upload or create one with that name now. Although fastec2 will happily use other named keys if you wish, you'll need to specify the key name every time if you don't use "default".

You don't need to make your base instance disk very big, since you can always use a larger size later when you launch new instances using your AMI. Generally, 60GB is a reasonable size to choose.

To create our base image, we'll need to start with some existing AMI that contains a Linux distribution. If you already have some preferred AMI that you use, feel free to use it; otherwise, we suggest using the latest stable Ubuntu image. To get the AMI id for the latest Ubuntu:

```
$ fe2 get-ami - id
ami-0c55b159cbfafe1f0
```

This shows a powerful feature of fastec2: all commands that start with "get-" return an AWS object, on which you can call any method or property (each of these commands also has a version without the get- prefix, which prints a brief summary of the object instead of returning it). Type your method or property name after a hyphen, as shown above. In this case, we're getting the 'id' property of the AMI object returned by get-ami, which defaults to the latest stable Ubuntu image (see below for examples of other AMIs). To see the list of properties and methods, simply call the command without a property or method added:

```
$ fe2 get-ami
Usage:       fe2 get-ami
             fe2 get-ami architecture
             fe2 get-ami block-device-mappings
             fe2 get-ami create-tags
             fe2 get-ami creation-date
             ...
```

Now you can launch your instance. This creates a new on-demand Linux instance, and when complete (it'll take a couple of minutes) it will print out the name, id, status, and IP address. The command will wait until ssh is accessible on your new instance before it returns:

```
$ fe2 launch base ami-0c55b159cbfafe1f0 50 m5.xlarge
base (i-00c7f2f81a841b525): running; 18.216.25.57
```

The fe2 launch command takes a minimum of 4 parameters: the name of the instance to create; the AMI to use (either id or name; here we're using the AMI id we retrieved earlier); the size of the disk to create, in GB; and the instance type. You can learn about the different instance types available from this AWS page. To see the pricing of different instances, you can use this command (replace m5 with whichever instance series you're interested in; note that currently only US prices are displayed, and they may not be accurate or up to date, so use the AWS web site for full price lists):

```
$ fe2 price-demand m5
m5.large      0.096
m5.metal      4.608
m5.xlarge     0.192
m5.2xlarge    0.384
m5.4xlarge    0.768
m5.12xlarge   2.304
m5.24xlarge   4.608
```

With our instance running, we can now connect to it with ssh:

```
$ fe2 connect base
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-1032-aws x86_64)
...
Last login: Fri Feb 15 22:10:28 2019 from 4.78.240.2
ubuntu@ip-172-31-13-138:~$
```

Now you can configure your base instance as required: go ahead and apt install any software you want, copy over data files you'll need, and so forth.
In order to use some of the features of fastec2 discussed below, you'll need tmux and lsyncd installed in your AMI, so go ahead and install them now:

```
sudo apt install -y tmux lsyncd
```

Also, if you'll be using the long-running script functionality in fastec2, you'll need a private key in your ~/.ssh directory which has permission to connect to another instance, to save the results of the script. So copy your regular private key over (if it's not too sensitive), or create a new one (type ssh-keygen, and grab the ~/.ssh/id_dsa.pub file it creates).

Check: make sure you've done the following in your instance before you make it into an AMI:

- installed lsyncd and tmux
- copied over your private key

If you want to connect to Jupyter Notebook (or any other service on your instance), you can use ssh tunneling. To create ssh tunnels, add an extra argument to the above fe2 connect command, passing in either a single int (one port) or an array (multiple ports), e.g.:

```
# Tunnel to just Jupyter Notebook, running on port 8888
$ fe2 connect od1 8888

# Two tunnels: Jupyter Notebook, and a server running on port 8008
$ fe2 connect od1 '[8888,8008]'
```

This doesn't do any fancy forwarding between different machines on the networks; it's just a direct connection from the computer you run fe2 connect on to the computer you're ssh'ing to (the same effect as ssh's standard -L local port forwarding). So generally you'll run this on your own PC, and then access Jupyter at http://localhost:8888 in your browser.

Creating your Amazon Machine Image (AMI)

Once you've configured your base instance, you can create your own AMI:

```
$ fe2 freeze base
ami-01b7ceef9767a163a
```

Here, 'freeze' is the command and 'base' is the argument; replace 'base' with the name of the base instance that you wish to "freeze" into an AMI. Note that your instance will be rebooted during this process, so ensure that you've saved any open documents and that it's OK to shut down. It might take 15 minutes or so for the process to complete; for very large disks of hundreds of GB, it could take hours. To check on progress, either look in the AMIs section of the AWS console, or type this command (it will display 'pending' whilst it is still creating the image):

```
$ fe2 get-ami base - state
pending
```

As you'll see, this is using the method-calling functionality of fastec2 that we saw earlier.

Launching and connecting to your instance

Now you've got your AMI, you can launch a new instance using that template. It only takes a couple of minutes for your new instance to be created, as follows:

```
$ fe2 launch inst1 base 80 m5.large
inst1 (i-0f5a3b544274c645f): running; 18.191.111.211
```

We're calling our new instance 'inst1', and using the 'base' AMI we created earlier. As you can see, the disk size and instance type need not be the same as you used when creating the AMI (although the disk size can't be smaller than the size you created it with). You can see all the options available for the launch command; we'll see how to use the iops and spot parameters in the next section:

```
$ fe2 launch --help
Usage: fe2 launch NAME AMI DISKSIZE INSTANCETYPE KEYNAME SECGROUPNAME IOPS SPOT
       fe2 launch --name NAME --ami AMI --disksize DISKSIZE --instancetype INSTANCETYPE
                  --keyname KEYNAME --secgroupname SECGROUPNAME --iops IOPS --spot SPOT
```

Congratulations, you've launched your first instance from your own AMI! You can repeat the previous fe2 launch command, just passing in a different name, to create more instances, and ssh to each with fe2 connect.
But what about the second issue: should OpenAI release their pretrained model? This one seems much more complex. We've already heard from the "anti-model-release" view, since that's what OpenAI has published, and also discussed with the media. Catherine Olsson, who previously worked at OpenAI, asked on Twitter if anyone has yet seen a compelling explanation of the alternative view:

"What have been your favorite on-the-merits pro-release OpenAI GPT-2 takes, on twitter or elsewhere? I'm looking for clear, good-faith explanation of the pro-release (or anti-media-attention) position right now, not clever snark."

- Best practices should be identified in research areas with more mature methods for addressing dual-use concerns, such as computer security, and imported where applicable to the case of AI.
- Actively seek to expand the range of stakeholders and domain experts involved in discussions of these challenges.

An important point here is that an appropriate analysis of potential malicious use of AI requires a cross-functional team and a deep understanding of history in related fields. I agree. So what follows is just my one little input to this discussion. I'm not ready to claim that I have the answer to the question "should OpenAI have released the model?". I will also try to focus on the "pro-release" side, since that's the piece that hasn't had much thoughtful input yet.

A case for releasing the model

OpenAI said that their release strategy is: "Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code." So, specifically, we need to be discussing scale. Their claim is that a larger-scale model may cause significant harm without time for the broader community to consider it. Interestingly, even they don't claim to be confident of this concern: "This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas."

Let's get specific: how much scale are we actually talking about? I don't see this explicitly mentioned in their paper or blog post, but we can make a reasonable guess. The new GPT-2 model has, according to the paper, about ten times as many parameters as their previous GPT model. Their previous model took 8 GPUs 1 month to train. One would expect that they can train their model faster by now, since they've had plenty of time to improve their algorithms; but on the other hand, their new model probably takes more epochs to train. Let's assume that these two balance out, so we're left with the difference of 10x in parameters.

If you're in a hurry, and you want to get this done in a month, then you're going to need 80 GPUs. You can grab a server with 8 GPUs from the AWS spot market for $7.34/hour. That's around $5,300 for a month. You'll need ten of these servers, so that's around $50k to train the model in a month. OpenAI have made their code available, and described how to create the necessary dataset, but in practice there's still going to be plenty of trial and error, so it might cost twice as much.

If you're in less of a hurry, you could just buy 8 GPUs. With some careful memory handling (e.g. using gradient checkpointing), you might be able to get away with buying RTX 2070 cards, at $500 each; otherwise, you'll be wanting the RTX 2080 Ti, at $1,300 each. So for 8 cards, that's somewhere between $4k and $10k for the GPUs, plus probably another $10k or so for a box to put them in, with CPUs, HDDs, etc. So that's around $20k to train the model in 10 months (again, you'll need some extra time and money for the data collection and some trial and error).
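The arithmetic behind those estimates is easy to check. Here's the back-of-envelope version, using only the rough figures quoted above (none of these are measured prices):

```python
# Back-of-envelope GPT-2 training cost, from the rough figures quoted above.
spot_price = 7.34            # $/hour for an 8-GPU server on the AWS spot market
hours_per_month = 24 * 30

per_server_month = spot_price * hours_per_month   # ~$5,285, i.e. "around $5,300"
servers = 80 // 8                                 # 10x params -> ~80 GPUs -> 10 servers
print(f"one server for a month:  ${per_server_month:,.0f}")
print(f"{servers} servers for a month: ${servers * per_server_month:,.0f}")  # ~"$50k"

# The slower option: buy 8 GPUs outright, and take ~10 months instead.
rtx2070, rtx2080ti = 500, 1300
print(f"8 GPUs: ${8 * rtx2070:,} to ${8 * rtx2080ti:,}, plus ~$10k for the box")
```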
Most organizations doing AI already have 8 or more GPUs available, and can often get access to far more (e.g. AWS provides up to $100k in credits to startups in its AWS Activate program, and Google provides dozens of TPUs to any research organization that qualifies for their research program). So in practice, the decision not to release the model has a couple of outcomes:

1. It'll probably take at least a couple of months before another organization has successfully replicated it, so we have some breathing room to discuss what to do when this is more widely available.
2. Small organizations that can't afford to spend $100k or so are not able to use this technology at the scale being demonstrated.

Point 1 seems like a good thing. If suddenly this tech is thrown out there for anyone to use, without any warning, then no one can be prepared at all. In theory, people could have been prepared, because those within the language modeling community have been warning of such a potential issue; but in practice, people don't tend to take it seriously until they can actually see it happening. This is what happens, for instance, in the computer security community, where if you find a flaw, the expectation is that you help the community prepare for it, and only then do you release full details (and perhaps an exploit). When this doesn't happen, it's called a zero-day attack or exploit, and it can cause enormous damage. I'm not sure I want to promote a norm that zero-day threats are OK in AI.

On the other hand, point 2 is a problem. The most serious threats are most likely to come from folks with the resources to spend $100k or so on, for example, a disinformation campaign to attempt to change the outcome of a democratic election. In practice, the most likely exploit is, in my opinion, a foreign power spending that money to dramatically escalate existing disinformation campaigns, such as those that have been extensively documented by the US intelligence community. The only practical defense against such an attack is, as far as I can tell, to use the same tools to both attempt to identify, and push back against, such disinformation. These kinds of defenses are likely to be much more powerful when wielded by the broader community of those impacted. The power of a large group of individuals has repeatedly been shown to be greater at creating than at destroying, as we see in projects such as Wikipedia or open source software. In addition, if these tools aren't in the hands of people without access to large compute resources, then they remain abstract and mysterious. What can they actually do? What are their constraints? For people to make informed decisions, they need to have a real understanding of these issues.

Conclusion

So, should OpenAI release their trained model? Frankly, I don't know. There's no question in my mind that they've demonstrated something fundamentally, qualitatively different to what's been demonstrated before (despite not showing any significant algorithmic or theoretic breakthroughs). And I'm sure it will be used maliciously: it will be a powerful tool for disinformation and for influencing discourse at massive scale, and probably only costs about $100k to create. By releasing the model, this malicious use will happen sooner. But by not releasing the model, there will be fewer defenses available, and less real understanding of the issues from those that are impacted. Those both sound like bad outcomes to me.
Five Things That Scare Me About AI

29 Jan 2019 · Rachel Thomas

AI is being increasingly used to make important decisions. Many AI experts (including Jeff Dean, head of AI at Google, and Andrew Ng, founder of Coursera and deeplearning.ai) say that warnings about sentient robots are overblown, but other harms are not getting enough attention. I agree. I am an AI researcher, and I'm worried about some of the societal impacts that we're already seeing. In particular, these 5 things scare me about AI:

1. Algorithms are often implemented without ways to address mistakes.
2. AI makes it easier to not feel responsible.
3. AI encodes and magnifies bias.
4. Optimizing metrics above all else leads to negative outcomes.
5. There is no accountability for big tech companies.

Before we dive in, I need to clarify one point that is important to understand: algorithms (and the complex systems they are a part of) can make mistakes. These mistakes come from a variety of sources: bugs in the code; inaccurate or biased data; approximations we have to make (e.g. you want to measure health and you use hospital readmissions as a proxy, or you are interested in crime and use arrests as a proxy; these things are related, but not the same); misunderstandings between different stakeholders (policy makers, those collecting the data, those coding the algorithm, those deploying it); how computer systems interact with human systems; and more.

This article discusses a variety of algorithmic systems. I don't find debates about definitions particularly interesting, including what counts as "AI" or whether a particular algorithm qualifies as "intelligent" or not. Please note that the dynamics described in this post hold true both for simpler algorithms and for more complex ones.

1. Algorithms are often implemented without ways to address mistakes

After the state of Arkansas implemented software to determine people's healthcare benefits, many people saw a drastic reduction in the amount of care they received, but were given no explanation and no way to appeal. Tammy Dobbs, a woman with cerebral palsy who needs an aide to help her get out of bed, go to the bathroom, get food, and more, had her hours of help suddenly reduced by 20 hours a week, transforming her life for the worse. Eventually, a lengthy court case uncovered errors in the software implementation, and Tammy's hours were restored (along with those of many others who were impacted by the errors).

Observations of 5th-grade teacher Sarah Wysocki's classroom yielded positive reviews. Her assistant principal wrote, "It is a pleasure to visit a classroom in which the elements of sound teaching, motivated students, and a positive learning environment are so effectively combined." Two months later, she was fired by an opaque algorithm, along with over 200 other teachers. The head of the PTA and a parent of one of Wysocki's students described her as "one of the best teachers I've ever come in contact with. Every time I saw her, she was attentive to the children, went over their schoolwork, she took time with them and made sure."

That people are losing needed healthcare without an explanation, or being fired without explanation, is truly dystopian.

Headlines from the Verge and the Washington Post

As I covered in a previous post, people use outputs from algorithms differently than they use decisions made by humans:

- Algorithms are more likely to be implemented with no appeals process in place.
- Algorithms are often used at scale.
- Algorithmic systems are cheap.
- People are more likely to assume algorithms are objective or error-free.

As Peter Haas said, "In AI, we have Milgram's ultimate authority figure," referring to Stanley Milgram's famous experiments showing that most people will obey orders from authority figures, even to the point of harming or killing other humans. How much more likely will people be to trust algorithms perceived as objective and correct?
There is a lot of overlap between these factors. If the main motivation for implementing an algorithm is cost-cutting, adding an appeals process (or even diligently checking for errors) may be considered an "unnecessary" expense. Cathy O'Neil, who earned her math PhD at Harvard, wrote a book, Weapons of Math Destruction, in which she covers how algorithms are disproportionately impacting poor people, whereas the privileged are more likely to still have access to human attention (in hiring, education, and more).

2. AI makes it easier to not feel responsible

Let's return to the case of the buggy software used to determine health benefits in Arkansas. How could this have been prevented? In order to prevent severely disabled people from mistakenly losing access to needed healthcare, we need to talk about responsibility. Unfortunately, complex systems lend themselves to a dynamic in which nobody feels responsible for the outcome. The creator of the algorithm for healthcare benefits, Brant Fries (who has been earning royalties off this algorithm, which is in use in over half of the 50 states), blamed state policy makers. I'm sure the state policy makers could blame the implementers of the software. When asked if there should be a way to communicate how the algorithm works to the disabled people losing their healthcare, Fries callously said, "It's probably something we should do. Yeah, I also should probably dust under my bed," and then later clarified that he thought it was someone else's responsibility.

This passing of the buck and failure to take responsibility is common in many bureaucracies. As danah boyd observed, "Bureaucracy has often been used to shift or evade responsibility. Who do you hold responsible in a complex system?" boyd gives the example of high-ranking bureaucrats in Nazi Germany who did not see themselves as responsible for the Holocaust, and continues, "Today's algorithmic systems are extending bureaucracy."

Another example of nobody feeling responsible comes from the case of research to classify gang crime. A database of gang members, assembled by the Los Angeles Police Department and 3 other California law enforcement agencies, was found to contain 42 babies who were under the age of 1 when added to the gang database (28 were said to have "admitted to being gang members"). Keep in mind that these are just some of the most obvious errors; we don't know how many other people were falsely included. When researchers presented work on using machine learning on this data to classify gang crimes, an audience member asked about ethical concerns. "I'm just an engineer," responded one of the authors.

I don't bring this up for the primary purpose of pointing fingers or casting blame. However, a world of complex systems in which nobody feels responsible for the outcomes (which can include severely disabled people losing access to the healthcare they need, or innocent people being labeled as gang members) is not a pleasant place. Our work is almost always a small piece of a larger whole, yet a sense of responsibility is necessary to try to address and prevent negative outcomes.

3. AI encodes and magnifies bias

But isn't algorithmic bias just a reflection of how the world is? I get asked a variation of this question every time I give a talk about bias. To which my answer is: no, our algorithms and products impact the world, and are part of feedback loops. Consider an algorithm to predict crime and determine where to send police officers: sending more police to a particular neighborhood is not just an effect, but also a cause. More police officers can lead to more arrests in a given neighborhood, which could cause the algorithm to send even more police to that neighborhood (a mechanism described in this paper on runaway feedback loops).
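To make that loop concrete, here's a toy simulation; it is purely illustrative, and not the model from the paper. Two neighborhoods have identical true crime rates, patrols are sent wherever past arrests are highest, and arrests can only be made where police are present:

```python
# Toy runaway feedback loop (illustrative only; not the model from the paper).
# Two neighborhoods have the SAME true crime rate, but patrols go where past
# arrests are highest, and arrests only happen where police patrol.
true_rate = [0.1, 0.1]        # identical underlying crime rates
arrests = [11, 10]            # a tiny, chance difference in historical data

for week in range(52):
    hot = 0 if arrests[0] >= arrests[1] else 1    # "predicted" high-crime area
    observed = round(100 * true_rate[hot])        # patrols find crime wherever they go...
    arrests[hot] += observed                      # ...which feeds next week's prediction

print(arrests)   # [531, 10]: a 1-arrest head start became a ~50x "difference"
```

The data ends up "proving" that neighborhood 0 is the high-crime area, even though both neighborhoods were identical by construction.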
Bias is being encoded, and even magnified, in a variety of applications:

- Software used to decide prison sentences has twice as high a false positive rate for Black defendants as for white defendants.
- Computer vision software from Amazon, Microsoft, and IBM performs significantly worse on people of color. Research by Joy Buolamwini and Timnit Gebru found that commercial computer vision software performed significantly worse on women with dark skin (Gendershades.org).
- Word embeddings, which are a building block for language tools like Gmail's SmartReply and Google Translate, generate useful analogies such as Rome:Italy :: Madrid:Spain, as well as biased analogies such as man:computer programmer :: woman:homemaker (see the sketch just after this list).
- Machine learning used in recruiting software developed at Amazon penalized applicants who attended all-women's colleges, as well as any resumes that contained the word "women's".
- Over 2/3 of the images in ImageNet, the most studied image data set in the world, are from the Western world (USA, England, Spain, Italy, Australia).

Chart from "No Classification without Representation" by Shankar et al., showing the origin of ImageNet photos: 45% US, 8% UK, 6% Italy, 3% Canada, 3% Australia, 3% Spain
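The embedding analogies above are easy to reproduce with vector arithmetic. Here's a sketch using gensim with the word2vec vectors trained on Google News; the file name and the exact neighbor lists depend on which pretrained vectors you download, so treat the specifics as assumptions:

```python
# Probe a pretrained word embedding for analogies via vector arithmetic.
# Requires: pip install gensim, plus the (~1.5GB) GoogleNews word2vec file.
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

# "Rome is to Italy as Madrid is to ?" -> Spain (the useful kind of analogy)
print(kv.most_similar(positive=['Madrid', 'Italy'], negative=['Rome'], topn=3))

# "man is to computer_programmer as woman is to ?" -> the biased kind
print(kv.most_similar(positive=['woman', 'computer_programmer'],
                      negative=['man'], topn=3))
```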
Since a Cambrian explosion of machine learning products is occurring, the biases that are calcified now, and in the next few years, may have a disproportionately huge impact for ages to come, and will be much harder to undo decades from now.

4. Optimizing metrics above all else leads to negative outcomes

Worldwide, people watch 1 billion hours of YouTube per day (yes, that says per day). A large part of YouTube's success has been due to its recommendation system, in which a video selected by an algorithm automatically begins playing once the previous video is over. Unfortunately, these recommendations are disproportionately for conspiracy theories: promoting white supremacy, climate change denial, and denial of the mass shootings that plague the USA. What is going on? YouTube's algorithm is trying to maximize how much time people spend watching YouTube, and conspiracy theorists watch significantly more YouTube than people who trust a variety of media sources. Unfortunately, a recommendation system trying only to maximize time spent on its own platform will incentivize content that tells you the rest of the media is lying. "YouTube may be one of the most powerful radicalizing instruments of the 21st century," Professor Zeynep Tufekci wrote in the New York Times.

Guillaume Chaslot is a former YouTube engineer turned whistleblower. He has been outspoken about the harms caused by YouTube, and he partnered with the Guardian and the Wall Street Journal to study the extremism and bias in YouTube's recommendations.

Photo of Guillaume Chaslot from the Guardian article

YouTube is owned by Google, which is earning billions of dollars by aggressively introducing vulnerable people to conspiracy theories, while the rest of society bears the externalized costs: rising authoritarian governments, a resurgence in white supremacist movements, failure to act on climate change even as extreme weather is creating increasing numbers of refugees, growing distrust of mainstream news sources, and a failure to pass sensible gun laws.

This problem is an example of the tyranny of metrics: metrics are just a proxy for what you really care about, and unthinkingly optimizing a metric can lead to unexpected, negative results. One analog example: when the UK began publishing the success rates of surgeons, heart surgeons began turning down risky (but necessary) surgeries, to try to keep their scores as high as possible.

Returning to the account of the popular 5th-grade teacher who was fired by an algorithm: she suspects that the underlying reason she was fired was that her incoming students had unusually high test scores the previous year (making it seem like their scores had dropped to a more average level after her teaching), and that their former teachers may have cheated. As USA education policy began over-emphasizing student test scores as the primary way to evaluate teachers, there have been widespread scandals of teachers and principals cheating by altering students' scores, in Georgia, Indiana, Massachusetts, Nevada, Virginia, Texas, and elsewhere. When metrics are given undue importance, attempts to game those metrics become common.

5. There is no accountability for big tech companies

Major tech companies are the primary ones driving AI advances, and their algorithms impact billions of people. Unfortunately, these companies have zero accountability. YouTube (owned by Google) is helping to radicalize people into white supremacy. Google allowed advertisers to target people who search racist phrases like "black people ruin neighborhoods", and Facebook allowed advertisers to target groups like "jew haters". Amazon's facial recognition technology misidentified 28 members of Congress as criminals, yet it is already in use by police departments. Palantir's predictive policing technology was used for 6 years in New Orleans, with city council members not even knowing about the program, much less having any oversight. The newsfeed/timeline recommendation algorithms of all the major platforms tend to reward incendiary content, prioritizing it for users.

In early 2018, the UN ruled that Facebook had played a "determining role" in the ongoing genocide in Myanmar. "I'm afraid that Facebook has now turned into a beast," said the UN investigator. This result was not a surprise to anyone who had been following the situation in Myanmar. People warned Facebook executives about how the platform was being used to spread dehumanizing hate speech and incite violence against an ethnic minority as early as 2013, and again in 2014 and 2015. As early as 2014, news outlets such as Al Jazeera were covering Facebook's role in inciting ethnic violence in Myanmar. One person close to the case said, "That's not 20/20 hindsight. The scale of this problem was significant, and it was already apparent." Facebook execs were warned in 2015 that Facebook could play the same role in Myanmar that radio broadcasts had played during the 1994 Rwandan genocide. As of 2015, Facebook only employed 4 contractors who spoke Burmese, the primary language in Myanmar.

Contrast Facebook's inaction in Myanmar with its swift action in Germany after the passage of a new law which could have resulted in penalties of up to 50 million euros: Facebook hired 1,200 German contractors in under a year. In 2018, five years after Facebook was first warned about how it was being used to incite violence in Myanmar, it hired "dozens" of Burmese contractors, a fraction of its response in Germany. The credible threat of a large financial penalty may be the only thing Facebook responds to.

While it can be easy to focus on regulations that are misguided or ineffective, we often take for granted safety standards and regulations that have largely worked well. One major success story comes from automobile safety.
Early cars had sharp metal knobs on the dashboard that lodged in people's skulls during crashes, plate glass windows that shattered dangerously, and non-collapsible steering columns that would frequently impale drivers. Beyond that, there was a widespread belief that the only issue with cars was the people driving them, and car manufacturers did not want data on car safety to be collected. It took consumer safety advocates decades to push the conversation to how cars could be designed with greater safety, and to pass laws regarding seat belts, driver licenses, crash tests, and the collection of car crash data. For more on this topic, Datasheets for Datasets covers case studies of how standardization came to the electronics, pharmaceutical, and automobile industries, and 99% Invisible has a deep dive on the history of car safety (with parallels and contrasts to the gun industry).