I have some tables that collect records on a daily basis, amounting to 20+ million rows per table. The tables are very similar and consist of a few string fields. I run a nightly job that gets counts and creates reports for these tables, and the daily count job alone takes a few minutes. I am now looking into producing more detailed statistics and wondering what the best approach would be, ideally one that compiles the stats faster. For example, here is the schema of one table:
TABLE A:
ID
PID
POSTDATE
FETCHDATE
CUSTOMERID
QUERY
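Roughly, the DDL looks like this (types simplified for the example; most fields are strings):

    CREATE TABLE TABLE_A (
        ID         BIGINT NOT NULL PRIMARY KEY,
        PID        VARCHAR(64),
        POSTDATE   DATETIME,
        FETCHDATE  DATETIME,
        CUSTOMERID VARCHAR(64),
        QUERY      VARCHAR(255)
    );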
The daily stats are just a distinct count of records by PID. I would now like to get additional counts, for example: total number of records per QUERY per CUSTOMERID, total number of records per PID, and so on (example queries below).

If I run these once a day, they will take a toll on a database that is continually being updated. Someone mentioned taking snapshots every hour, but I am not sure how that would work. Is there perhaps a way to have every insert into the main table automatically inserted into a duplicate table as well, and then run the stats against that copy? (I sketched what I imagine below.)

What if I wanted this data available in milliseconds, to view in some web application grid or table? Some proposed the idea of MongoDB, but I do not want to invest time in learning new software. How would Google, for example, track their searches if they wanted the top queries for a day, and eventually over 7 days, and so on?
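For reference, here is roughly what the current nightly stat and the new stats would look like as queries (MySQL-flavored syntax, untuned, just to show what I mean):

    -- current nightly stat: distinct PIDs for the day
    SELECT COUNT(DISTINCT PID)
    FROM TABLE_A
    WHERE POSTDATE >= CURRENT_DATE - INTERVAL 1 DAY;

    -- desired: total number of records per QUERY per CUSTOMERID
    SELECT CUSTOMERID, QUERY, COUNT(*) AS total
    FROM TABLE_A
    GROUP BY CUSTOMERID, QUERY;

    -- desired: total number of records per PID
    SELECT PID, COUNT(*) AS total
    FROM TABLE_A
    GROUP BY PID;

And the duplicate-table idea I have in mind is something like an AFTER INSERT trigger (again MySQL-style; TABLE_A_COPY would be a mirror table I create just for stats):

    CREATE TRIGGER copy_to_stats
    AFTER INSERT ON TABLE_A
    FOR EACH ROW
      -- mirror every new row into the stats copy
      INSERT INTO TABLE_A_COPY (ID, PID, POSTDATE, FETCHDATE, CUSTOMERID, QUERY)
      VALUES (NEW.ID, NEW.PID, NEW.POSTDATE, NEW.FETCHDATE, NEW.CUSTOMERID, NEW.QUERY);

This is just what I imagine, not something I have tested, so I do not know if it is the right approach.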
Any ideas appreciated.