Issue indexing data in Solr

The issue is how to index this data so that it can also be queried.
I need to retrieve events by date, day of week, and time slot. My primary ID is the event ID.
Table 1:
select event_id, from_date, to_date from event_listing (event_id is the primary key)
Table 2:
select dayOfWeek, slot_start_time, slot_end_time from slot_data where event_id = $eventId
I need to index this data in Solr so that each event yields exactly one record.
Once the data is indexed, I need to query it by time slot and dayOfWeek.
What is a suitable way to do this?

Related

Delete based on column values when it is part of a composite key

I have a table with an id and a date; (id, date) is the composite key for the table.
I am trying to delete all entries older than a specific date:
delete from my_table where date < '2018-12-12'
The query plan shows a sequential scan on the date column.
I would like to make use of the existing index, since the number of distinct ids is very small compared to the total number of rows in the table.
How do I do it? I have tried searching for it, but to no avail.
If your use case involves archiving data monthly or over some other time period, consider changing your database table to use partitions.
Say you collect data monthly and want to keep the data for the last 5 months. It would be efficient to partition the table by month of the year.
This will:
- optimise your READ queries (table scans reduce to partition scans)
- optimise your DELETE requests (just drop the whole partition)
You need an index on date for this query:
create index idx_my_table_date on my_table(date);
Alternatively, you can drop your existing index and add a new one on (date, id). date needs to be the first key for this query.
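As a rough illustration of the index suggestion, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the asker's database (the original engine is not stated, and planners differ between engines). The helper names make_demo_db and delete_older_than and the sample rows are made up for the example. The plan is printed rather than asserted, because its wording varies by SQLite version.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table shaped like the one in the question (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER, date TEXT, PRIMARY KEY (id, date))"
    )
    # Secondary index with date as the leading column, as the answer suggests.
    conn.execute("CREATE INDEX idx_my_table_date ON my_table (date)")
    rows = [(i % 3, f"2018-12-{day:02d}") for i, day in enumerate(range(1, 25))]
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    conn.commit()
    return conn


def delete_older_than(conn, cutoff):
    """Delete rows dated strictly before cutoff and return how many were removed."""
    cur = conn.execute("DELETE FROM my_table WHERE date < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


conn = make_demo_db()
# Ask SQLite how it intends to run the delete (output format is version dependent).
plan = conn.execute(
    "EXPLAIN QUERY PLAN DELETE FROM my_table WHERE date < ?", ("2018-12-12",)
).fetchall()
print(plan)

removed = delete_older_than(conn, "2018-12-12")
remaining = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
print(removed, remaining)
```

ISO-formatted dates (YYYY-MM-DD) are used so that text comparison matches date order in SQLite; an engine with a native date type would not need that.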

phpMyAdmin: date and time for historical records

I'm developing an application for managing deliveries in a company, using NetBeans and phpMyAdmin.
Every day I have to save hundreds of deliveries in the database, each with its own data but all with that day's date, so that I can later run queries such as
select * from table_1 where `date` = '02/10/2016' for example.
I can create a field of type "date" in the table, but the same date will then be repeated hundreds of times just to identify one day, and again the next day, and so on.
What's the best way to avoid this redundancy?
You could use datetime as the column type, which will reduce your redundancy.
When you want to retrieve all entries for a particular date, you could use select * from table_1 where `date` LIKE '2016-10-02%' in your query.
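To make the per-day lookup concrete, here is a small sketch using Python's sqlite3 module in place of MySQL. The column name delivered_at and the sample timestamps are invented for the example. It shows the prefix LIKE from the answer next to a half-open range comparison, which is a different technique, not something the answer proposed; on the sample data both return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# delivered_at is a hypothetical column holding 'YYYY-MM-DD HH:MM:SS' text.
conn.execute("CREATE TABLE table_1 (id INTEGER PRIMARY KEY, delivered_at TEXT)")
conn.executemany(
    "INSERT INTO table_1 (delivered_at) VALUES (?)",
    [
        ("2016-10-01 23:59:59",),
        ("2016-10-02 00:00:00",),
        ("2016-10-02 14:30:00",),
        ("2016-10-03 00:00:00",),
    ],
)


def deliveries_on_like(conn, day):
    """Prefix match on the day, as in the answer above."""
    return conn.execute(
        "SELECT id FROM table_1 WHERE delivered_at LIKE ? || '%' ORDER BY id",
        (day,),
    ).fetchall()


def deliveries_on_range(conn, day):
    """Half-open range [day, next day); a plain comparison on the column."""
    return conn.execute(
        "SELECT id FROM table_1 "
        "WHERE delivered_at >= ? AND delivered_at < date(?, '+1 day') "
        "ORDER BY id",
        (day, day),
    ).fetchall()


like_ids = deliveries_on_like(conn, "2016-10-02")
range_ids = deliveries_on_range(conn, "2016-10-02")
print(like_ids, range_ids)
```

Whether an index on the column helps either form depends on the engine and its settings, so it is worth checking the query plan on the real database.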

How to maintain a list of N most recent viewed items per user in a relational database

I would like to keep track of the last n items that a user has viewed in a PostgreSQL database. My first thought is to create a table such as
CREATE TABLE history (
id SERIAL PRIMARY KEY,
user_id integer REFERENCES users (id),
item_id integer REFERENCES items (id),
view_date timestamp DEFAULT current_timestamp
);
When a user views an object, a new row in the history table will record this view. But I only need to maintain the last n views for each user, and this approach will store every view that ever occurs.
Is there an efficient way to periodically drop all users' entries that are in excess of their n most recent?
EDIT: If there's a better way to store this data than using a SQL table, I'd be interested to hear about that.
delete from history
where id in (
    select id
    from (
        select
            id,
            row_number() over(
                partition by user_id
                order by view_date desc
            ) as rn
        from history
    ) s
    where rn > n
)
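The delete above can be tried end to end with a small sketch. It uses Python's sqlite3 module rather than PostgreSQL (window functions need SQLite 3.25 or newer), passes n as a bound parameter, and uses invented sample rows; prune_history is a hypothetical helper name.

```python
import sqlite3

# Same shape as the query above, with n supplied as a parameter.
PRUNE_SQL = """
DELETE FROM history
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               row_number() OVER (
                   PARTITION BY user_id
                   ORDER BY view_date DESC
               ) AS rn
        FROM history
    ) AS s
    WHERE rn > ?
)
"""


def prune_history(conn, n):
    """Keep only the n most recent views per user."""
    conn.execute(PRUNE_SQL, (n,))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, item_id INTEGER, view_date TEXT)"
)
views = [
    (1, 10, "2024-01-01 09:00:00"),
    (1, 11, "2024-01-02 09:00:00"),
    (1, 12, "2024-01-03 09:00:00"),
    (2, 10, "2024-01-01 10:00:00"),
]
conn.executemany(
    "INSERT INTO history (user_id, item_id, view_date) VALUES (?, ?, ?)", views
)

prune_history(conn, 2)
kept = conn.execute(
    "SELECT user_id, item_id FROM history ORDER BY user_id, view_date"
).fetchall()
print(kept)
```

User 1 loses only the oldest view, and user 2, who has fewer than n views, is untouched.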
"Is there an efficient way to periodically drop all users' entries that are in excess of their n most recent?"
Set up a job that groups, orders and drops every ten minutes or so. You aren't going to find a lot of room for improvement in that sort of query.
From a design perspective though I would favor creating an in memory data structure which you load/save at the start/end of the user's session. That way you don't beat up your database with this sort of work. But your requirements may make this strategy impossible.
Cheers!
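One possible shape for the in-memory structure mentioned above is a bounded deque per user. This is only a sketch: the class name RecentViews is invented, loading and saving at session boundaries is left out, and moving a re-viewed item to the front is an assumption about the desired behaviour rather than something the question specifies.

```python
from collections import defaultdict, deque


class RecentViews:
    """Tracks the n most recently viewed items per user, in memory only."""

    def __init__(self, n):
        # Each user gets a deque that silently drops its oldest entry when full.
        self._views = defaultdict(lambda: deque(maxlen=n))

    def record(self, user_id, item_id):
        views = self._views[user_id]
        # Assumption: viewing an item again moves it to the front
        # instead of storing it twice.
        if item_id in views:
            views.remove(item_id)
        views.append(item_id)

    def recent(self, user_id):
        """Return the user's items, most recent first."""
        return list(reversed(self._views[user_id]))


tracker = RecentViews(3)
for item in [1, 2, 3, 4, 2]:
    tracker.record("alice", item)
print(tracker.recent("alice"))
```

Persisting the list would then be a single write of at most n rows per user at the end of the session.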
"If there's a better way to store this data than using a SQL table, I'd be interested to hear about that."
A database is for persisting values or object states over a fairly long period of time. If you need frequent access to, or updates of, the most recent items, use a cache.
You can listen for cache notifications: when the list expires or is evicted, capture it, serialize it, and save it to the database.
http://msdn.microsoft.com/en-us/library/ee808091(v=azure.10).aspx

Calculate date and time key in fact table using existing date time field

I have a date-time field in a fact table in the format MM/DD/YY HH:MM:SS (e.g. 2/24/2009 11:18:47 AM), and I have separate date and time dimension tables. How can I create a date key and a time key in the fact table from the date-time field, so that I can join to the date and time dimensions?
There are a lot of references on creating separate date and time dimensions and their benefits, but I could not find how to create date and time keys in the fact table from an existing date-time field.
I have also heard that keeping the date-time field in the fact table has certain benefits. If so, what would you recommend: should I have all three (date key, time key, and date-time field) in the fact table? The date key and time key are must-haves for me, and I am concerned about the fact table's size if I also keep the date-time field.
Thank you all for any help you can give.
What you need to do (if I understand correctly) is create two fields in your fact table: kTime and kDate.
We would always suggest giving the primary keys of DimTime and DimDate a meaning (this is a special case; normally dimension tables' primary keys don't have any meaning). For example, DimDate would have kDate as its primary key, with values formatted as YYYYMMDD, so that ordering by kDate puts rows in date order. DimTime would then have kTime as its primary key, in the form HHMM or HHMMSS (depending on the resolution you need).
It is best to keep the actual date-time field on the fact table as well, since it lets SQL use its built-in date/time functions for subsetting. If you also extend your dimension tables with useful extra columns, such as DimDate (DayOfWeek, IsHoliday, DayNumber, MonthNumber, YearNumber, etc.) and DimTime (HourNumber, MinuteNumber, IsWorkingTime), you can perform very interesting queries very simply.
So, to answer your question, "how to create date and time keys in the fact table using existing date time field?": as you load the data into the fact table, use the built-in date/time functions to create the separate date and time fields.
Whether this approach produces a lot of data depends very much on how many rows you expect in your fact table, but it is the easiest to work with from a data warehouse point of view.
best of luck!
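As a sketch of the key derivation described above, the following Python function parses a timestamp in the question's example format and produces a YYYYMMDD date key and an HHMMSS time key. The function name date_time_keys is invented, integer keys are an assumption (the answer only specifies the formatting), and in practice the same arithmetic could be done with the database's own date/time functions during the load.

```python
from datetime import datetime


def date_time_keys(raw):
    """Split e.g. '2/24/2009 11:18:47 AM' into (kDate, kTime) integer keys."""
    # %I and %p handle the 12-hour clock with AM/PM used in the example value.
    ts = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")
    k_date = ts.year * 10000 + ts.month * 100 + ts.day      # YYYYMMDD
    k_time = ts.hour * 10000 + ts.minute * 100 + ts.second  # HHMMSS, 24-hour
    return k_date, k_time


print(date_time_keys("2/24/2009 11:18:47 AM"))
print(date_time_keys("2/24/2009 1:05:09 PM"))
```

Note that an integer time key drops leading zeros (00:05:09 becomes 509), so a fixed-width text key may be preferable if the key is meant to be read directly.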

What is the best way to structure Days of the week in a db

This is a normalization thing, but I have to hold information about the days of the week, where the user selects each day and enters a start time and a finish time. I need this info to be stored in a db. I could simply add 14 fields to the table and it would work (MondayStart, MondayFinish, TuesdayStart, etc.). This doesn't seem
Do NOT design your database to match the UI.
My time keeping system at my job has a place to enter data for each day of the week. That doesn't mean you store it that way.
You need a table for users and one for times:
User_T
    User_ID
Time_log_T
    User_ID
    Start_dt (datetime)
    End_dt (datetime)
Everything can be derived from this.
If you want one check-in per day, create a unique constraint on (User_ID, TRUNC(Start_dt)). This will handle third shifts that wrap across days. An RDBMS constraint cannot express that the next Start_dt for a given User_ID is > MAX(End_dt) for that user; you'll have to do that in code. Of course, if you allow records from previous days to be entered or corrected, you'll need more complex validation to ensure they do not overlap.
Think of all the queries you'd throw at these tables; this will beat the 14 columns 99% of the time.
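Here is a minimal sketch of the two-table design, using Python's sqlite3 module. SQLite has no TRUNC, so the one-check-in-per-day rule is approximated with a unique index on the date part of Start_dt; the index name, the add_shift helper and the sample shifts are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE User_T (User_ID INTEGER PRIMARY KEY);
    CREATE TABLE Time_log_T (
        User_ID  INTEGER NOT NULL REFERENCES User_T (User_ID),
        Start_dt TEXT NOT NULL,   -- 'YYYY-MM-DD HH:MM:SS'
        End_dt   TEXT NOT NULL
    );
    -- Stand-in for UNIQUE (User_ID, TRUNC(Start_dt)): one check-in per start day.
    CREATE UNIQUE INDEX one_check_in_per_day
        ON Time_log_T (User_ID, substr(Start_dt, 1, 10));
    """
)
conn.execute("INSERT INTO User_T VALUES (1)")


def add_shift(conn, user_id, start_dt, end_dt):
    """Insert a shift; return False if the user already checked in that day."""
    try:
        conn.execute(
            "INSERT INTO Time_log_T VALUES (?, ?, ?)", (user_id, start_dt, end_dt)
        )
        return True
    except sqlite3.IntegrityError:
        return False


add_shift(conn, 1, "2024-01-01 09:00:00", "2024-01-01 17:00:00")  # a Monday
add_shift(conn, 1, "2024-01-02 22:00:00", "2024-01-03 06:00:00")  # wraps midnight

# Day of week (0 = Sunday) and minutes worked are derived, not stored.
rows = conn.execute(
    """
    SELECT CAST(strftime('%w', Start_dt) AS INTEGER) AS day_of_week,
           (CAST(strftime('%s', End_dt) AS INTEGER)
            - CAST(strftime('%s', Start_dt) AS INTEGER)) / 60 AS minutes
    FROM Time_log_T
    WHERE User_ID = ?
    ORDER BY Start_dt
    """,
    (1,),
).fetchall()
print(rows)
```

The shift that crosses midnight is stored as a single row and still counts toward the day it started on, which is the behaviour the unique constraint relies on.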
Users
    id
    ...etc...
Days
    id
    day nvarchar (Monday, Tuesday, etc.)
    start_time datetime
    end_time datetime
    user_id
You could also break out day in Days into a separate day-of-week table to enforce consistency, for example if you only want to allow specific days, so Days would become
Days
    id
    day_of_week_id
    ...etc...
DaysOfWeek
    id
    name
I don't think moving the data to another table would accomplish anything. There would still be a one-to-one (main record to 14 fields) relationship. It would be more complex and run slower.
Your instincts are good but in this case I think you would be better off leaving the data in the table. Over-normalization is a bad thing.
You could create a table with 3 columns -- one for the day (this would be the primary key), one for the start time, and one for the finish time.
You would then have one row for each day of the week.
You could extend it with, say, a column for a user id, if you are storing the start and finish time for each user on each day (in this case, the primary key would be user id and day of the week)... or something similar to suit your needs.
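The last suggestion, one row per user per day with a composite primary key, might look like the sketch below. It uses Python's sqlite3 module; the table name day_hours, the set_hours helper, and the convention that 0 means Monday are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE day_hours (
        user_id     INTEGER NOT NULL,
        day_of_week INTEGER NOT NULL CHECK (day_of_week BETWEEN 0 AND 6),
        start_time  TEXT NOT NULL,   -- 'HH:MM'
        finish_time TEXT NOT NULL,
        PRIMARY KEY (user_id, day_of_week)
    )
    """
)


def set_hours(conn, user_id, day_of_week, start_time, finish_time):
    """Create or overwrite the single row for this user and day."""
    conn.execute(
        "INSERT OR REPLACE INTO day_hours VALUES (?, ?, ?, ?)",
        (user_id, day_of_week, start_time, finish_time),
    )


set_hours(conn, 7, 0, "09:00", "17:00")
set_hours(conn, 7, 1, "10:00", "18:00")
set_hours(conn, 7, 0, "08:30", "16:30")  # replaces the earlier day-0 row

schedule = conn.execute(
    "SELECT day_of_week, start_time, finish_time FROM day_hours "
    "WHERE user_id = ? ORDER BY day_of_week",
    (7,),
).fetchall()
print(schedule)
```

The composite key guarantees at most seven rows per user, so the table stays as compact as the 14-column layout while remaining easy to query by day.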