Get Quotes Process Flow

Declaration of the QuotesCategory Key-Value Pairs

The QuotesCategory dictionary is defined as a set of key-value pairs.

  • Key is the phrase that is matched against the user's speech input.

  • Value is the tag that is appended to the URL to fetch the quote.

QuotesCategory = {
    'knowledge': 'knowledge', 'love': 'love', 'life': 'life', 'inspirational': 'inspirational',
    'humor': 'humor', 'philosophy': 'philosophy', 'god': 'god', 'truth': 'truth',
    'wisdom': 'wisdom', 'poetry': 'poetry', 'romance': 'romance', 'death': 'death',
    'happiness': 'happiness', 'hope': 'hope', 'faith': 'faith', 'writing': 'writing',
    'life lessons': 'life-lessons', 'motivational': 'motivational', 'religion': 'religion',
    'relationships': 'relationships', 'success': 'success', 'spirituality': 'spirituality',
    'time': 'time', 'science': 'science'
}

Identification of the Quote Category

The flow starts with identifyQuoteCategory, which identifies the quote category by checking the user request against the keys in QuotesCategory.

  • If the user request matches a key, the getQuote flow is triggered with the corresponding tag.

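tag = ""  # stays empty if no key matches
for key in QuotesCategory:
    # Direct match: the key appears in the user's transformed speech
    if subStrCheck(transformedSpeech, key):
        tag = QuotesCategory[key]
        break

if tag == "":
    # No direct match: fall back to the synonym-based lookup
    identifyQuoteCategorySynonmys(transformedSpeech, tag, aifriend)
else:
    getQuote(tag, aifriend)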
  • Otherwise, the identifyQuoteCategorySynonmys flow is triggered to check whether the user request contains a synonym of any predefined key.

for key in QuotesCategory:
    CategorySynonmys = getSynonmys(key)

    # Multi-word synonyms may contain "_" or "-"; turn them into spaces before matching
    matched = any(
        subStrCheck(transformedSpeech, synonym.replace("_", " ").replace("-", " "))
        for synonym in CategorySynonmys
    )

    if matched:
        tag = QuotesCategory[key]
        break
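The two-stage lookup described above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `sub_str_check` is a hypothetical stand-in for `subStrCheck` (assumed here to be a case-insensitive whole-phrase match), `SAMPLE_SYNONYMS` is a hand-written stand-in for the WordNet lookup, and the dictionary is only an excerpt.

```python
import re

# Excerpt of the category dictionary from this page.
QUOTES_CATEGORY = {"love": "love", "life lessons": "life-lessons", "humor": "humor"}

# Hand-written stand-in for the WordNet lookup (assumption, not real WordNet output).
SAMPLE_SYNONYMS = {"humor": ["humor", "wit", "comedy"], "love": ["love", "passion"]}


def sub_str_check(speech, phrase):
    """Hypothetical stand-in for subStrCheck: case-insensitive whole-phrase match."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, speech, flags=re.IGNORECASE) is not None


def identify_tag(speech):
    """Return the URL tag for the speech, or "" when nothing matches."""
    # Stage 1: direct match on the dictionary keys.
    for key, tag in QUOTES_CATEGORY.items():
        if sub_str_check(speech, key):
            return tag
    # Stage 2: fall back to synonyms of each key.
    for key, tag in QUOTES_CATEGORY.items():
        for synonym in SAMPLE_SYNONYMS.get(key, []):
            normalized = synonym.replace("_", " ").replace("-", " ")
            if sub_str_check(speech, normalized):
                return tag
    return ""
```

Matching on word boundaries is a design choice of this sketch: it keeps a request containing "glove" from matching the "love" category, which a plain substring test would allow.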

The identifyQuoteCategorySynonmys flow calls the getSynonmys function to get the synonyms of each predefined category key.

  • It uses NLTK WordNet to collect the synonyms (lemma names) for a given key, removes duplicates, and returns the list to identifyQuoteCategorySynonmys, which uses it to identify the quote category tag.

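from nltk.corpus import wordnet

synonyms = []
# Collect the lemma names of every WordNet synset for the key
for syn in wordnet.synsets(key):
    for lemma in syn.lemmas():
        synonyms.append(lemma.name())

# Remove duplicates while preserving first-seen order
synonymsuniquelist = []
for i in synonyms:
    if i not in synonymsuniquelist:
        synonymsuniquelist.append(i)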
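The de-duplication step, together with the underscore and hyphen clean-up applied before matching, can be sketched without NLTK. The helper names are hypothetical, and `sample_lemmas` is a made-up list standing in for real WordNet output.

```python
def unique_in_order(items):
    """Drop duplicates while keeping first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(items))


def to_phrase(lemma_name):
    """Turn a lemma name such as 'go_for' into a plain phrase for speech matching."""
    return lemma_name.replace("_", " ").replace("-", " ")


# Made-up lemma names with repeats, standing in for real WordNet output.
sample_lemmas = ["hope", "promise", "hope", "go_for", "promise"]
phrases = [to_phrase(name) for name in unique_in_order(sample_lemmas)]
print(phrases)  # ['hope', 'promise', 'go for']
```

`dict.fromkeys` gives the same result as the explicit loop, but avoids rescanning the output list for every item.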

Getting the Random Quote by Web scraping for the Identified Quote category/tag

Once the quote category (tag) has been identified, either by direct matching or through NLTK synonyms, the getQuote flow is triggered to fetch the quote.

A random page number is drawn with random.randrange(0,10), which yields a value from 0 to 9 (the upper bound is excluded), and scrape_website_quote is then called to scrape all the quotes on that page for the given tag.

randompagenum = random.randrange(0,10)
scrape_website_quote(randompagenum,tag)
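As a rough sketch of how the random page number and the tag might be combined into the request URL, assuming the Goodreads URL pattern used by scrape_website_quote on this page: `pick_random_page` and `build_quote_url` are hypothetical helper names, and the stripping and percent-encoding of the tag are additions of this sketch rather than part of the original flow.

```python
import random
from urllib.parse import quote

BASE_URL = "https://www.goodreads.com/quotes/tag/"  # pattern taken from this page


def pick_random_page():
    """randrange excludes the stop value, so this yields 0 through 9."""
    return random.randrange(0, 10)


def build_quote_url(tag, page_number):
    """Hypothetical helper: join tag and page number into the request URL."""
    # strip() drops stray spaces around the tag; quote() percent-encodes
    # any remaining characters that are unsafe in a URL path.
    return BASE_URL + quote(tag.strip()) + "?page=" + str(page_number)


print(build_quote_url("life-lessons", 3))
# https://www.goodreads.com/quotes/tag/life-lessons?page=3
```

Cleaning the tag here means a dictionary value with an accidental trailing space still produces a well-formed URL.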

The scrape_website_quote function scrapes all the quotes on the given page for the given tag and appends them to the quotes and authors lists.

  • It uses the Requests package to fetch the webpage from the URL.

  • It then parses the webpage with the BeautifulSoup package.

  • If the previous function did not identify a tag, popular quotes are fetched instead.

import requests
from bs4 import BeautifulSoup

page_num = str(page_number)  # Convert the page number to a string
# Build the URL from the tag and the page number
URL = "https://www.goodreads.com/quotes/tag/" + tag + "?page=" + page_num

webpage = requests.get(URL, timeout=10)  # Request the page; fail instead of hanging forever
soup = BeautifulSoup(webpage.text, "html.parser")  # Parse the HTML of the page
quoteText = soup.find_all('div', attrs={'class': 'quoteText'})  # Every div with class "quoteText"

for i in quoteText:
    # Keep only the first line of the block, which holds the quote itself
    quote = i.text.strip().split('\n')[0]
    # The author sits in a nested span with class "authorOrTitle"
    author = i.find('span', attrs={'class': 'authorOrTitle'}).text.strip()
    quotes.append(quote)
    authors.append(author)
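To try the extraction logic without a network call, the same idea can be sketched against a static HTML string. Note the swap: this sketch uses the standard library's `html.parser` instead of BeautifulSoup so that it runs with no third-party packages. `SAMPLE_HTML` is a made-up snippet whose structure is only inferred from the `quoteText` and `authorOrTitle` selectors above; the real page markup may differ.

```python
from html.parser import HTMLParser

# Made-up markup; real pages may be structured differently.
SAMPLE_HTML = """
<div class="quoteText">
  First sample quote.
  <br>
  <span class="authorOrTitle">Author One</span>
</div>
<div class="quoteText">
  Second sample quote.
  <br>
  <span class="authorOrTitle">Author Two</span>
</div>
"""


class QuoteParser(HTMLParser):
    """Collect quote text and author from div.quoteText blocks."""

    def __init__(self):
        super().__init__()
        self.quotes, self.authors = [], []
        self._depth = 0          # div nesting depth inside the current quote block
        self._in_author = False  # True while inside span.authorOrTitle
        self._text, self._author = [], []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "quoteText" in classes:
                self._depth = 1
                self._text, self._author = [], []
        elif tag == "span" and self._depth and "authorOrTitle" in classes:
            self._in_author = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_author = False
        elif tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Keep only the first line of the block, as the original code does.
                text = "".join(self._text).strip()
                self.quotes.append(text.split("\n")[0].strip())
                self.authors.append("".join(self._author).strip())

    def handle_data(self, data):
        if self._in_author:
            self._author.append(data)
        elif self._depth:
            self._text.append(data)


def parse_quotes(html):
    """Return (quotes, authors) extracted from the given HTML string."""
    parser = QuoteParser()
    parser.feed(html)
    parser.close()
    return parser.quotes, parser.authors
```

Calling `parse_quotes(SAMPLE_HTML)` returns two parallel lists, mirroring the quotes and authors lists filled by scrape_website_quote.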

A combined list is built with one "quote-author" entry per scraped quote, and the total number of quotes is counted.

A random quote is then selected from the scraped quotes and returned to the user along with the tag name; if no tag was identified, the phrase "popular" is added instead.

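for quote, author in zip(quotes, authors):
    combined_list.append(quote+'-'+author)
    quotemaxlimit = quotemaxlimit+1

# randrange excludes the upper bound; starting at 0 lets the first quote be picked
# and avoids an error when only one quote was scraped
# (assumes randomquote is used as a zero-based index into combined_list)
randomquote = random.randrange(0,quotemaxlimit)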
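The selection step can also be sketched with random.choice, which avoids index arithmetic altogether. `pick_random_quote` is a hypothetical helper name, and returning None for an empty result is an assumption of this sketch about how a page with no quotes might be handled.

```python
import random


def pick_random_quote(quotes, authors):
    """Return one 'quote-author' string at random, or None if nothing was scraped."""
    combined = [quote + "-" + author for quote, author in zip(quotes, authors)]
    if not combined:
        return None  # random.choice would raise IndexError on an empty list
    return random.choice(combined)


print(pick_random_quote(["First sample quote."], ["Author One"]))
# First sample quote.-Author One
```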

Control then returns to the NextStep flow.
