The Joys Of Programming

PragProg's Christmas-in-July sale is on:

https://media.pragprog.com/newslette...

Happy July 24, or as the gerbils like to call it, "Christmas in July!"

Use coupon code XMAS724 to save 40% on all ebooks, screencasts, and audio books (yes! we have audio books too!) from pragprog.com.

IMAGE(https://i.redd.it/uf6rjj174ef31.png)

I'm working on a fun little project right now that I had exactly zero background in when I started, but I've learned quite a bit! I'm running all the GWJ Conference Calls through a Speech-to-Text pipeline to generate transcripts, and then putting those through an LDA topic model to see how the podcast and its topics have evolved over time.

It took me quite a while to get my Google Cloud account set up right and feed it all the correct credentials, but since then I've been rolling along! So far I've spent almost $200 of my $400 in free GCP credits and processed over 5000 minutes of podcast audio. I'll definitely continue to document my progress as I go and share the results in a forum thread once I get them. At the current pace it's going to take another 5 days of processing and at least 4 more free Google trials with alternate emails to get everything parsed.
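For anyone who wants to play with the numbers above, here is a rough back-of-envelope sketch in Python. It assumes cost scales linearly with minutes of audio, and it treats "almost $200" and "over 5000 minutes" as exactly 200 and 5000, which the post does not claim. The function names are made up for illustration.

```python
def cost_per_minute(spent_dollars, minutes_processed):
    """Average spend per minute of audio, assuming linear pricing."""
    if minutes_processed <= 0:
        raise ValueError("minutes_processed must be positive")
    return spent_dollars / minutes_processed


def minutes_affordable(credit_dollars, rate_per_minute):
    """How many more minutes a given credit balance covers at that rate."""
    return credit_dollars / rate_per_minute


# Rounded figures from the post (assumptions, see above).
rate = cost_per_minute(200, 5000)
remaining = minutes_affordable(400 - 200, rate)
print(f"about ${rate:.2f} per minute; remaining credit covers roughly {remaining:.0f} more minutes")
```

Under those assumptions the remaining credit covers about as much audio again as has been processed so far, which fits with needing several more trial accounts for the full archive.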

That’s awesome! I’m keen to see some results.

staygold wrote:

I'll definitely continue to document my progress as I go and share the results in a forum thread once I get them.

I'm looking forward to this. Awesome project!

CPWilson wrote:
staygold wrote:

I'll definitely continue to document my progress as I go and share the results in a forum thread once I get them.

I'm looking forward to this. Awesome project!

I can only hope to approach the level of your awesome infographics!

Update 1: I'm through nearly 100 podcasts for speech to text. Running some intermediate descriptive statistics:

  • 86 podcasts
  • 115 hours, 11 minutes, 28 seconds, 93 milliseconds of podcast time
  • 815 895 spoken words (+/- about 15%, the accuracy isn't superb on the speech-to-text)
  • 3 965 324 characters in the words
  • $222.89 CDN in compute cost
  • $28.99 CDN in federal/provincial taxes
  • 11 693 mentions of "game" or "games"
  • 31 mentions of "nerd" or "nerds"
  • 8 mentions of "booth babes"
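Counts like the "mentions" lines above could be produced along these lines. This is only a minimal sketch in Python, not the code actually used for the stats: the function name, the grouping of terms, and the sample sentence are all made up for illustration, and it assumes the transcript is already available as one plain-text string.

```python
import re


def count_mentions(transcript, terms):
    """Count case-insensitive, whole-word matches of any of the given terms."""
    total = 0
    lowered = transcript.lower()
    for term in terms:
        # \b keeps "game" from also matching inside "games" or "gamer".
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        total += len(re.findall(pattern, lowered))
    return total


# Made-up sample text, not taken from any real transcript.
sample = "This game is great. Games like this game make nerds happy."

# Each label sums the counts of its singular and plural forms.
groups = {
    "game/games": ["game", "games"],
    "nerd/nerds": ["nerd", "nerds"],
    "booth babes": ["booth babes"],
}

for label, terms in groups.items():
    print(label, count_mentions(sample, terms))
```

Matching on word boundaries means each form is counted separately and then summed, so "games" is never double-counted as both "game" and "games". Speech-to-text errors will still skew the totals either way.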

This afternoon I started and finished a web crawler that goes through and downloads all the podcasts, so I am now the unofficial historian and keeper of GWJ podcast history (except episodes 3-52 and 55, which I think rabbit had hosted locally and which aren't available any more).

I found a slick solution using R to convert .mp3 files to .wav, so I'm running that, using Python to talk to Google Cloud, and just fine-tuning my topic modelling algorithm, which is a lift and shift from a work project. Hopefully I'll have some early results for the first 100 or so podcasts by end of week.
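For readers curious what goes into a topic model: LDA works on per-document word counts (a "bag of words"), so transcripts usually get tokenized and counted first. The following is a minimal sketch in Python of that preparation step only, not the work-project code mentioned above and not LDA itself. The stopword list, function names, and sample sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it", "this", "that"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def build_vocabulary(token_lists):
    """Map each distinct token to an integer id, sorted for reproducibility."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(tokens, vocab):
    """Sparse (token_id, count) pairs for one document, sorted by id."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    return sorted(counts.items())


# Made-up stand-ins for two episode transcripts.
transcripts = [
    "This game is the game of the year",
    "The podcast is about games and nerds",
]

token_lists = [tokenize(t) for t in transcripts]
vocab = build_vocabulary(token_lists)
corpus = [bag_of_words(tokens, vocab) for tokens in token_lists]
print(vocab)
print(corpus)
```

Each transcript ends up as a short list of (word id, count) pairs, which is the general shape of input a topic model is then fitted on. Note that this sketch does no stemming, so "game" and "games" stay separate words.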

P.S. I have no classical training in coding (Bachelor's in Chemical Engineering, Master's in Business Management), so if I'm not proof you can teach anyone to code, I don't know what more you need!

How many mentions of "Legion" though?

*Legion* wrote:

How many mentions of "Legion" though?

Legion? Omnipresent, of course.

*Legion* wrote:

How many mentions of "Legion" though?

Legion > Booth Babes
(12 mentions of Legion)

Your code must be buggy. No way am I losing to "nerds"!

IMAGE(https://media.giphy.com/media/8SjVrE66V6WVG/source.gif)