The Naked Scientists
  • Login
  • Register
  • Podcasts
      • The Naked Scientists
      • eLife
      • Naked Genetics
      • Naked Astronomy
      • In short
      • Naked Neuroscience
      • Ask! The Naked Scientists
      • Question of the Week
      • Archive
      • Video
      • SUBSCRIBE to our Podcasts
  • Articles
      • Science News
      • Features
      • Interviews
      • Answers to Science Questions
  • Get Naked
      • Donate
      • Do an Experiment
      • Science Forum
      • Ask a Question
  • About
      • Meet the team
      • Our Sponsors
      • Site Map
      • Contact us

User menu

  • Login
  • Register
  • Home
  • Help
  • Search
  • Tags
  • Member Map
  • Recent Topics
  • Login
  • Register
  1. Naked Science Forum
  2. Non Life Sciences
  3. Geek Speak
  4. What's the best database structure for storing results over time?
« previous next »
  • Print
Pages: [1]   Go Down

What's the best database structure for storing results over time?

  • 4 Replies
  • 252686 Views
  • 0 Tags

0 Members and 5 Guests are viewing this topic.

Offline chris (OP)

  • Naked Science Forum King!
  • ******
  • 7947
  • Activity:
    2.5%
  • Thanked: 273 times
  • The Naked Scientist
    • View Profile
    • The Naked Scientists
What's the best database structure for storing results over time?
« on: 19/01/2020 16:23:19 »
Dear all

I have written a function analyse the log data for our podcast and webpage downloads.

I want to store this in a database so that we can look up the total amount of traffic for a programme episode, or a webpage, over time but also log a day-by-day record of consumption.

What is the best database structure for this? There are thousands of webpages and podcasts. One easy way would be to enter a line in a table for each episode each day and record the total traffic for that day; but then extracting data for that episode would be very processor hungry wouldn't it? I'd need to pull all the entries for that url and then sum the activity.

Is there a better way to do this?
Logged
I never forget a face, but in your case I'll make an exception - Groucho Marx - https://www.thenakedscientists.com/
 



Offline PKorda

  • First timers
  • *
  • 1
  • Activity:
    0%
  • Naked Science Forum Newbie
    • View Profile
    • nude celebrities
Re: What's the best database structure for storing results over time?
« Reply #1 on: 29/01/2020 07:12:16 »
Why not use Google Analytics? Or a free self-hosted solution.
« Last Edit: 19/02/2020 22:29:37 by Colin2B »
Logged
 

Offline Hayseed

  • Sr. Member
  • ****
  • 350
  • Activity:
    16%
  • Thanked: 9 times
  • Naked Science Forum Crackpot
    • View Profile
Re: What's the best database structure for storing results over time?
« Reply #2 on: 31/01/2020 05:47:14 »
I have never run a website, but wouldn't the server logs have all of the time stamped data that you would need?

You just want numbers per time don't you?

One could plot all stamps to each page of the whole site.  Stamps per page per time.

Edit:  one could have another routine that looks up the IP addresses.  For geographic data......assuming not too many are using VPNs.
« Last Edit: 31/01/2020 05:52:15 by Hayseed »
Logged
The proper hardware will eliminate all theory.
 

Offline nicephotog

  • Sr. Member
  • ****
  • 427
  • Activity:
    0%
  • Thanked: 7 times
  • H h H h H h H h H h
    • View Profile
Re: What's the best database structure for storing results over time?
« Reply #3 on: 15/03/2020 02:37:32 »
You may have forgotten two things, tagging and binary itself.
With tagging you can bring things down to abberative, such as rather than simply "UTC time" or "UTC-time-date" that can be expressed as single number of compulsory character lengths(sizeof primitive data-type) , the "field" in the "database table" can have the size set to take such "data" leaving each piece extractable.(utc time date can be written as a number only of a particular length as a "string/char array" known as "human readable" in the UTC format) so would be a small text DB data type. (note: it is/was possible to obtain EXACT instant UTC time date over the net but you must acquire it only once and write your own converter and ticker program to operate constantly for other programs on your computer, the usual is actuially to set the clock of the server with that UTC call by extracting the information from the format , then obtain all the information in CGI program call instances and convert it to UTC)

In short, as indexing the pieces of a program for the processor memory location, you do not need thhe human readable or machine readable info, you only need to index to a set number between 0 - 255 inclusive a representation of the information in the fields' columns.
e.g. in the "media_type" column , you can set its fields to a "binary int" and "not null" rule where 1 = page , 2 = audio 3 = pic 4  = video
AND e.g.
protocol 1 = http , 2 =https 3 = ssl
AND more complex
"not null" "int" in a "url_name" column simply list a unique number for each specific url in the site.
These numbers returned from the database in a query are simply translated by the CGI program written for them.
That technique passes much lest repetitive data.
« Last Edit: 15/03/2020 02:48:20 by nicephotog »
Logged
 

Offline chris (OP)

  • Naked Science Forum King!
  • ******
  • 7947
  • Activity:
    2.5%
  • Thanked: 273 times
  • The Naked Scientist
    • View Profile
    • The Naked Scientists
Re: What's the best database structure for storing results over time?
« Reply #4 on: 18/02/2021 00:20:45 »
I'm circling back to describe how I completed this project, just to close the loop. It took a while because Covid-19 hijacked my life for 12 months, but, craving distraction, I returned to the challenge a few weeks back and finished it.

The challenge was that I am logging consumption of audio files from the site on a daily basis. This means that I need to log thousands of different filenames every day, and my challenge was how to do this in a sane way that was easy to interrogate and efficient to store.

The solution was to write a filter to extract from the logs the relevant entries each day and de-duplicate them (to avoid double-counting) and file this in a database.

I created a pair of tables, one for filenames of the downloaded files and the second for dates and downloads.

Basically, I insert each unique filename into the download table where it is assigned a unique ID number. Then, for each day, I insert a row for the date and the id number of the downloaded file on that date as well as how many copies of that file id were downloaded on that day.

It means I can easily sum total downloads for a file, or daily totals for files etc.

Thanks for those here who made suggestions for how to do this.
« Last Edit: 18/02/2021 00:23:03 by chris »
Logged
I never forget a face, but in your case I'll make an exception - Groucho Marx - https://www.thenakedscientists.com/
 



  • Print
Pages: [1]   Go Up
« previous next »
Tags:
 

Similar topics (5)

Must ∞ monkeys on ∞ typewriters really write everything given ∞ time?

Started by chiralSPOBoard General Science

Replies: 28
Views: 24761
Last post 28/03/2020 11:42:26
by yor_on
We Know The Extent Of The Sun, What Is The Extent Of Space Time?

Started by TitanscapeBoard Physics, Astronomy & Cosmology

Replies: 2
Views: 11149
Last post 27/04/2008 23:10:10
by turnipsock
What does "time-like" mean in the following sentence?

Started by scheradoBoard Physics, Astronomy & Cosmology

Replies: 15
Views: 9342
Last post 09/02/2018 10:28:21
by Colin2B
If you could travel faster than light, could you travel in time?

Started by DmaierBoard Technology

Replies: 13
Views: 14315
Last post 19/03/2020 14:56:52
by Paul25
If the speed of light is constant, time must be constant too?

Started by Chuck FBoard General Science

Replies: 4
Views: 11907
Last post 19/03/2020 14:51:12
by Paul25
There was an error while thanking
Thanking...
  • SMF 2.0.15 | SMF © 2017, Simple Machines
    Privacy Policy
    SMFAds for Free Forums
  • Naked Science Forum ©

Page created in 0.16 seconds with 46 queries.

  • Podcasts
  • Articles
  • Get Naked
  • About
  • Contact us
  • Advertise
  • Privacy Policy
  • Subscribe to newsletter
  • We love feedback

Follow us

cambridge_logo_footer.png

©The Naked Scientists® 2000–2017 | The Naked Scientists® and Naked Science® are registered trademarks created by Dr Chris Smith. Information presented on this website is the opinion of the individual contributors and does not reflect the general views of the administrators, editors, moderators, sponsors, Cambridge University or the public at large.