How I built an AI app with Heroku

Spencer Holley
5 min read · Jan 10, 2021

In this article I will take you through the end-to-end process I used to build a web app that classifies an article as AI-generated or written by a real person. Click here to see my repo.

Inspiration

Wikipedia is an amazing place full of free information maintained by a community of volunteer editors. It has made giving and receiving knowledge very easy. Unfortunately, with this great availability of knowledge comes the potential spread of misinformation. With the rise of complex transformer models such as GPT-2, AI can generate persuasive content that is practically indistinguishable from human-written text. We can train a machine learning model to classify whether a body of text was written by a human or an AI.

Web Scraping

Here are some of these random topics

I scraped 1,000 Wikipedia articles to use as training data for GPT-2. I generated a list of 1,000 topics using an online random topic generator and pasted them all into a txt document. Then I scraped the Wikipedia article available under each topic; unfortunately, I had to manually tweak some topic names to get them to match the real titles of the Wikipedia pages. I saved the articles into a csv file for the classifier's training data and a txt file for the GPT-2 training data. A rough sketch of this step is below.
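Here is roughly what that looked like, assuming the topics live in a topics.txt file (one per line) and using the third-party wikipedia package; the file names and error handling are illustrative, not an exact copy of my notebook.

```python
import csv
import wikipedia

with open("topics.txt") as f:
    topics = [line.strip() for line in f if line.strip()]

articles = []
for topic in topics:
    try:
        # auto_suggest helps when a topic name doesn't exactly match a page title
        page = wikipedia.page(topic, auto_suggest=True)
        articles.append((page.title, page.content))
    except wikipedia.exceptions.DisambiguationError as e:
        print(f"Ambiguous topic '{topic}', e.g. {e.options[:3]}")
    except wikipedia.exceptions.PageError:
        print(f"No page found for '{topic}'")

# csv for the classifier's training data
with open("articles.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["title", "text"])
    writer.writerows(articles)

# plain txt for the GPT-2 fine-tuning data
with open("articles.txt", "w", encoding="utf-8") as f:
    for _, text in articles:
        f.write(text + "\n\n")
```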

More on GPT-2

GPT-2 is a language generator released by OpenAI in 2019; its larger versions were rolled out slowly due to safety concerns. GPT stands for Generative Pre-trained Transformer, and the 2 marks it as the second model in the series. It uses a decoder-only transformer architecture with self-attention that outperforms the recurrent models traditionally used for NLP. The model can generate any kind of text when fine-tuned on texts of similar length; for example, if generating tweets you should train it on tweet-length bodies of text. GPT-2 is known for generating an essay arguing that recycling is bad for the world!

Tuning GPT-2 and generating articles

Here’s a little sneak peek of the GPT-2 training process

I used GPT-2 via the gpt-2-simple library. I had some frustrations getting it to work: I ultimately downgraded to TensorFlow 1.15, because TensorFlow 2.x versions don’t have tf.contrib, which gpt-2-simple depends on, and I also had to mount Google Drive.
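The Colab setup looked roughly like this (exact version pins may vary):

```python
# Pin TensorFlow 1.15 because TF 2.x removed tf.contrib, which gpt-2-simple needs
!pip install tensorflow==1.15
!pip install gpt-2-simple

import gpt_2_simple as gpt2

# Mount Google Drive so checkpoints survive the Colab session
gpt2.mount_gdrive()
```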

Next, I loaded in the txt file of all the articles, selected which version of GPT-2 to use, and set the number of training steps. I chose the 355-million-parameter version because it is advised for datasets with over 10 MB of data, and I set steps to 10,400 because the loss would get down to 1.1 or 1.2, which was low enough for the articles to be fully coherent but high enough that they still read differently from the real articles. Finally, I generated 180 articles and saved them to a csv file; I stopped at 180 because of resource constraints.
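The fine-tuning and generation calls look roughly like this with gpt-2-simple; the sampling settings (length, temperature) below are placeholders rather than my exact values:

```python
# Download the 355M-parameter model (advised for datasets over ~10 MB of text)
gpt2.download_gpt2(model_name="355M")

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="articles.txt",   # the txt file of scraped articles
              model_name="355M",
              steps=10400,
              print_every=100,
              sample_every=1000,
              save_every=1000)

# Generate the 180 fake articles and save them for the classifier dataset
fake_articles = gpt2.generate(sess,
                              nsamples=180,
                              length=1023,      # max tokens per sample
                              temperature=0.7,
                              return_as_list=True)

import pandas as pd
pd.DataFrame({"text": fake_articles}).to_csv("generated_articles.csv", index=False)
```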

Building the classifier

The model properly classified 94% of the AI generated text and 84% of the real articles.

For the classifier dataset I took 180 randomly selected real articles from the csv file and merged them with the 180 generated articles, so there was no class imbalance. After tokenizing all the articles, I wrote an RNN-building function and wrapped it in KerasClassifier() so I could run a grid search to find the best combination of model parameters (sketched below). In the end, the best model was a Bidirectional LSTM with a tanh activation, an rmsprop optimizer, and 64 nodes in the LSTM layer. It scored 97% accuracy on training data and 83% on testing data, with 100% recall and 67% precision. In other words, the model was better at catching AI-generated text than at recognizing human-written text, which is the better problem to have because the model is supposed to flag AI-written text.
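A condensed sketch of the grid search setup is below; the vocabulary size, sequence length, and parameter grid are placeholder assumptions, not my exact notebook values:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

VOCAB_SIZE = 20000   # assumed tokenizer vocabulary size
MAX_LEN = 2000       # roughly the length of the training articles

def build_rnn(units=64, activation="tanh", optimizer="rmsprop"):
    model = Sequential([
        Embedding(VOCAB_SIZE, 128, input_length=MAX_LEN),
        Bidirectional(LSTM(units, activation=activation)),
        Dense(1, activation="sigmoid"),   # 1 = AI generated, 0 = human written
    ])
    model.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
    return model

clf = KerasClassifier(build_fn=build_rnn, epochs=10, batch_size=16, verbose=0)

param_grid = {
    "units": [32, 64, 128],
    "activation": ["tanh", "relu"],
    "optimizer": ["rmsprop", "adam"],
}

# X is the padded token matrix, y the 0/1 labels from the merged dataset
grid = GridSearchCV(clf, param_grid, cv=3)
# grid.fit(X, y)
# print(grid.best_params_)
```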

The grid search took about 14 hours to complete, so the process was quite long and tedious. The vast majority of the models performed poorly; only a few of them were any good at all. Ultimately it was worth it, because I ended up with a model that performed well and did a decent job of classifying outside data.

Deploying to Heroku

I deployed this model to a web app so that the public could actually use it. This video by Krish Naik was a lifesaver, and I used his app.py as a template for mine. To create the app I went through the following steps: save the model to my local environment, create an HTML template for the app, write a Flask app script and insert the model, create configuration files, push all of these things to a repo, and deploy the repo. I will note that it took a while to get the deployment to work; big thanks to my bootcamp instructor for helping me figure it out. I ultimately had to downgrade the app from TensorFlow 2.4.0 to 2.3.2 to get it working. The finished product is an app in which a user can paste in an essay and press a “predict” button to get the probability that the text is AI generated. It works best on essays of 1,600–2,000 words, because that’s how long the articles the model was trained on are.
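Here is a stripped-down sketch of what an app.py like this looks like; the file names, form field, and template are placeholders rather than my exact code:

```python
# app.py — load the saved classifier and serve predictions through Flask
import pickle

from flask import Flask, render_template, request
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

app = Flask(__name__)
model = load_model("classifier.h5")          # assumed file name for the saved model
with open("tokenizer.pkl", "rb") as f:       # assumed file name for the saved tokenizer
    tokenizer = pickle.load(f)

MAX_LEN = 2000  # must match the sequence length used in training

@app.route("/")
def home():
    return render_template("index.html")

@app.route("/predict", methods=["POST"])
def predict():
    essay = request.form["essay"]
    seq = pad_sequences(tokenizer.texts_to_sequences([essay]), maxlen=MAX_LEN)
    prob = float(model.predict(seq)[0][0])
    return render_template("index.html",
                           prediction_text=f"Probability of being AI generated: {prob:.0%}")

if __name__ == "__main__":
    app.run(debug=True)
```

Alongside app.py, the configuration files Heroku needs typically include a Procfile (for example, web: gunicorn app:app) and a requirements.txt pinning tensorflow==2.3.2.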

Click here to demo

After scraping Wikipedia, generating articles with GPT-2, building a classification model, saving the best model to my local environment, wrapping the model into a Flask app, and deploying the app to Heroku, I now have a web app that classifies text as AI- or human-generated.
