Rails / Postgres: Get percentage of availability - sql

I am using a Postgres table (model CalendarEntry) in which I store entries with start_date and end_date values.
Now I want to check if I am available for a certain timespan (check_start, check_end):
count_entries = CalendarEntry.where('start_date <= ?', check_end).where('end_date >= ?', check_start).count
i_am_available = count_entries == 0
I want to change this behavior to get the percentage of availability.
How could I count the available days instead of the entries themselves?
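One possible approach, sketched here in plain Python rather than ActiveRecord: collect the distinct days covered by the overlapping entries, then compare that count with the total number of days in the checked span. The function name availability_percentage and the sample dates are made up for illustration, and the sketch assumes both ends of every range are inclusive.

```python
from datetime import date, timedelta


def availability_percentage(entries, check_start, check_end):
    """Return the percentage of days in [check_start, check_end] with no entry."""
    total_days = (check_end - check_start).days + 1
    busy_days = set()
    for start, end in entries:
        # Clip each entry to the checked span so outside days are ignored.
        day = max(start, check_start)
        last = min(end, check_end)
        while day <= last:
            busy_days.add(day)  # a set counts overlapping entries only once
            day += timedelta(days=1)
    return 100.0 * (total_days - len(busy_days)) / total_days


entries = [
    (date(2024, 1, 1), date(2024, 1, 3)),
    (date(2024, 1, 3), date(2024, 1, 5)),  # overlaps the first entry on Jan 3
]
percentage = availability_percentage(entries, date(2024, 1, 1), date(2024, 1, 10))
```

Using a set means a day covered by several entries is still counted once, which a plain count of entries cannot express.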

Related

Counting number of events (protests) from GDELT database

My goal is to get the monthly number of protests in Mexico reported between the years 2004 and 2020. I am using Google BigQuery to get this data from the GDELT database.
My problem is that I am getting different results when running the same query on different tables.
select
GlobalEventID
,MonthYear
,ActionGeo_Long
,ActionGeo_Lat
from
gdelt-bq.full.events_partitioned -- Returns 34650 records
--gdelt-bq.gdeltv2.events_partitioned -- Returns 93551 records
where
_PARTITIONTIME >= TIMESTAMP('2004-01-01')
and _PARTITIONTIME <= TIMESTAMP('2020-12-31')
and EventRootCode = '14'
and ActionGeo_CountryCode = 'MX'
;
Can you tell me which table I should use and why the query results differ from each other?
According to the GDELT documentation, gdeltv2 contains more events and is more up to date for recent years. However, they may not have finished backpopulating it to 1979.
The following query shows that only 20340 of the 93563 event IDs exist in both tables, so for such a large time range you may get the best results by using the v1 table before 2015 and the v2 table from 2015 onwards.
SELECT COUNT(*)
FROM gdelt-bq.gdeltv2.events_partitioned g2
JOIN gdelt-bq.full.events_partitioned g1 ON g1.GlobalEventID = g2.GlobalEventID
WHERE g2._PARTITIONTIME >= TIMESTAMP('2004-01-01')
AND g2._PARTITIONTIME <= TIMESTAMP('2020-12-31')
AND g2.EventRootCode = '14'
AND g2.ActionGeo_CountryCode = 'MX'
AND g1._PARTITIONTIME >= TIMESTAMP('2004-01-01')
AND g1._PARTITIONTIME <= TIMESTAMP('2020-12-31')
AND g1.EventRootCode = '14'
AND g1.ActionGeo_CountryCode = 'MX'

Django Query : how do I find the maximum time duration from start and end time fields?

Here is my Django model:
class Shift(models.Model):
    worker = models.OneToOneField('Worker', null=True)
    date = models.DateField(null=True)
    shiftTime = models.CharField(max_length=10, default="N/A")
    timeIn = models.TimeField(null=True)
    timeOut = models.TimeField(null=True)
I need to find the worker who spent the maximum amount of time in the office in a given date range. How do I calculate the time duration from the timeIn and timeOut fields in a Django query?
Edit: I don't want to use another attribute duration because that seems redundant. Is there any other way to do it than using raw query or duration attribute?
Django 1.10 introduced the ability to natively do date/time diffs through the ORM. This query would get you the longest shift:
from django.db.models import DurationField, ExpressionWrapper, F

longest_shift = (
    Shift.objects
    .annotate(shift_length=ExpressionWrapper(
        F('timeOut') - F('timeIn'),
        output_field=DurationField()))
    .order_by('-shift_length')
    .first()
)
You can add a filter for a specific date range as required by adding a filter() clause before annotate().
Iterate through the objects and get the maximum duration between the timeIn and timeOut.
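from datetime import date, datetime, timedelta

def get_largest_time_diff():
    # TimeField values are datetime.time objects, which cannot be subtracted
    # directly, so combine each with a fixed date before taking the difference.
    largest = timedelta(0)
    shifts = Shift.objects.exclude(timeIn=None).exclude(timeOut=None)
    for shift in shifts:
        duration = (datetime.combine(date.min, shift.timeOut)
                    - datetime.combine(date.min, shift.timeIn))
        if duration > largest:
            largest = duration
    return largest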
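Python's datetime.time objects do not support subtraction, so when looping in Python the duration has to be computed after anchoring both times on a date. A minimal sketch outside Django, assuming a shift never crosses midnight; the worker names and times are invented sample data and shift_duration is a hypothetical helper:

```python
from datetime import date, datetime, time, timedelta


def shift_duration(time_in, time_out):
    """Return time_out - time_in as a timedelta (same-day shifts only)."""
    anchor = date(2000, 1, 1)  # arbitrary date, used only to build datetimes
    return datetime.combine(anchor, time_out) - datetime.combine(anchor, time_in)


# Hypothetical sample data: worker name -> (timeIn, timeOut)
shifts = {
    "alice": (time(9, 0), time(17, 30)),
    "bob": (time(8, 0), time(12, 15)),
}

# Pick the worker whose shift lasted longest.
longest_worker = max(shifts, key=lambda worker: shift_duration(*shifts[worker]))
longest_duration = shift_duration(*shifts[longest_worker])
```

For large tables the annotate() approach is likely preferable, since the database does the comparison instead of Python.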

Time based priority in Active Record Query

I have a table of job listings which, when displayed, are normally ordered by the created_at field descending. I am in the process of adding a "featured" boolean flag which would let customers get more visibility for their job listing. I'd like to have the featured listings pinned to the top of the search results if the job is less than X days old. How would I modify my existing query to support this?
Jobs.where("expiration_date >= ? and published = ?", Date.today, true).order("created_at DESC")
Current query pulls back all current, published jobs, ordered by created_at.
Unlike some other databases (like Oracle) PostgreSQL has a fully functional boolean type. You can use it directly in an ORDER BY clause without applying a CASE statement - those are great for more complex situations.
Sort order for boolean values is:
FALSE -> TRUE -> NULL
If you ORDER BY bool_expression DESC, you invert the order to:
NULL -> TRUE -> FALSE
If you want TRUE first and NULL last, use the NULLS LAST clause of ORDER BY:
ORDER BY (featured AND created_at > now() - interval '11 days') DESC NULLS LAST
, created_at DESC
Of course, NULLS LAST is only relevant if featured or created_at can be NULL. If the columns are defined NOT NULL, then don't bother.
Also, FALSE would be sorted before NULL. If you don't want to distinguish between these two, you are either back to a CASE statement, or you can throw in NULLIF() or COALESCE().
ORDER BY NULLIF((featured AND created_at > now() - interval '11 days'), FALSE)
DESC NULLS LAST
, created_at DESC
Performance
Note how I used:
created_at > now() - interval '11 days'
and not:
now() - created_at < interval '11 days'
In the first example, the expression to the right is a constant that is calculated once. Then an index can be utilized to look up matching rows. Very efficient.
The latter cannot usually be used with an index. A value has to be computed for every single row, before it can be checked against the constant expression to the right. Don't do this if you can avoid it. Ever!
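The effect of the two-level ORDER BY can be mimicked in plain Python to see what it produces. This is only an illustrative sketch: order_jobs, the dictionary keys and the sample rows are invented, and NULL handling is left out by assuming featured and created_at are always set. The cutoff is computed once up front, mirroring the constant expression in the index-friendly form of the comparison.

```python
from datetime import datetime, timedelta


def order_jobs(jobs, now, pin_days=11):
    """Sort jobs so recent featured ones come first, then newest first."""
    cutoff = now - timedelta(days=pin_days)  # computed once, not per row
    return sorted(
        jobs,
        # Tuple key: (pinned?, created_at). With reverse=True, True sorts
        # before False, and newer dates before older ones.
        key=lambda job: (job["featured"] and job["created_at"] > cutoff,
                         job["created_at"]),
        reverse=True,
    )


now = datetime(2024, 1, 31)
jobs = [
    {"id": "a", "featured": False, "created_at": datetime(2024, 1, 30)},
    {"id": "b", "featured": True, "created_at": datetime(2024, 1, 25)},
    {"id": "c", "featured": True, "created_at": datetime(2024, 1, 1)},
]
ordered_ids = [job["id"] for job in order_jobs(jobs, now)]
```

Job "c" is featured but older than the cutoff, so it falls back into the normal created_at ordering.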
Not sure what you want to achieve here.
I guess you'll be paginating the results. If so, and you want to display featured jobs always on top, regardless of the page, then you should pull them from the DB separately. If you just want to display them on the first page, order by published like this:
Jobs.where("expiration_date >= ?", Date.today).order("published DESC, created_at DESC")
If you want to pull them separately:
@featured_jobs = Jobs.where("expiration_date >= ? AND published = ?", Date.today, true).order("created_at DESC")
@regular_jobs = Jobs.where("expiration_date >= ? AND published = ?", Date.today, false).order("created_at DESC") # Paginate them too ... depends on the gem you're using

Sql where t.date attribute is less than current year

In my Rails 3 app, I'm attempting to do a find of current students by their school's name and their graduation date in relation to the current year. I can do a successful find for users without a graduation date (see below), but I want to search users who have a graduation date attribute less than - or greater than - the current year. FYI, I'm using PostgreSQL.
The fields I'm using are set up as follows:
t.string :high_school
t.date :hs_grad_year
Here's the find I have working currently:
<%= pluralize(Profile.where(:high_school => "#{@highschool.name}").where("hs_grad_year IS NOT NULL").count, "person") %>
There are a couple of issues with your code (getting records from your database should be done in the controller, not in the view, and you're using high school names as foreign keys instead of an id field, for example), but to answer your question:
Profile.where(
  "high_school = ? AND EXTRACT(year FROM hs_grad_year) < EXTRACT(year FROM current_date)",
  @highschool.name
)
You want a .where("extract(year from hs_grad_year) < extract(year from current_date)") in there, since hs_grad_year is a date column.
In Postgres, current_date is the current date (yyyy-mm-dd), and the extract function pulls a single part, such as the year, out of a date. The date and time functions section of the Postgres documentation covers this in more detail.
Profile.
  where(:high_school => @highschool.name).
  where("EXTRACT(year FROM hs_grad_year) < ? OR EXTRACT(year FROM hs_grad_year) > ?", Date.today.year, Date.today.year)

How do I get a report of all the users registered per day

I have a table called users with two columns: name and created_at. The created_at column is of type datetime and stores the datetime when the user was created.
I need to know the number of users created on each day of a given date range. Let's say I ask for a user report between 1-nov-2010 and 30-nov-2010. I need something like this:
1-nov-2010: 2
2-nov-2010: 5
The problem I am running into is that the created_at data has values down to the second. How do I check whether a created_at value falls on a given date?
Any help in solving this problem is appreciated.
I am using mysql5.
select date_format(created_at, '%e-%b-%Y'), count(*)
from users
where created_at >= '2010-11-01' and created_at < '2010-12-01'
group by date(created_at);
MySQL lets you do lots of date-ish things even with datetimes.
An alternative if computing the day after the end day is troublesome:
where date(created_at) between '2010-11-01' and '2010-11-30'
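The same idea, a half-open range on the timestamp plus grouping on the date part, can be sketched in plain Python. The function users_per_day and the sample timestamps are made up for illustration; they are not taken from the question's data.

```python
from collections import Counter
from datetime import date, datetime


def users_per_day(created_ats, range_start, range_end_exclusive):
    """Count timestamps per calendar day within [range_start, range_end_exclusive)."""
    counts = Counter(
        ts.date()  # drop the time part, like date(created_at) in the query
        for ts in created_ats
        if range_start <= ts < range_end_exclusive
    )
    return dict(sorted(counts.items()))


created_ats = [
    datetime(2010, 11, 1, 9, 30),
    datetime(2010, 11, 1, 23, 59, 59),
    datetime(2010, 11, 2, 0, 0),
    datetime(2010, 12, 1, 0, 0),  # outside the half-open range
]
report = users_per_day(created_ats, datetime(2010, 11, 1), datetime(2010, 12, 1))
```

The exclusive upper bound keeps the last second of 30 November inside the range without having to spell out 23:59:59.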