SQL Query converting to Rails Active Record Query Interface

I have been using raw SQL queries in my Rails code, and they need to be transitioned to the Active Record Query Interface. I haven't used Active Record before, so I went through http://guides.rubyonrails.org/active_record_querying.html to learn the syntax for this way of getting the data. I am able to convert the simple queries into this format, but there are more complex queries like
SELECT b.owner,
Sum(a.idle_total),
Sum(a.idle_monthly_usage)
FROM market_place_idle_hosts_summaries a,
(SELECT DISTINCT owner,
hostclass,
week_number
FROM market_place_idle_hosts_details
WHERE week_number = '#{week_num}'
AND Year(updated_at) = '#{year_num}') b
WHERE a.hostclass = b.hostclass
AND a.week_number = b.week_number
AND Year(updated_at) = '#{year_num}'
GROUP BY b.owner
ORDER BY Sum(a.idle_monthly_usage) DESC
which I need in Active Record format. Because of its complexity, I am stuck on how to proceed with the conversion.
The output of the query is something like this
+----------+-------------------+---------------------------+
| owner | sum(a.idle_total) | sum(a.idle_monthly_usage) |
+----------+-------------------+---------------------------+
| abc | 485 | 90387.13690185547 |
| xyz | 815 | 66242.01857376099 |
| qwe | 122 | 11730.609939575195 |
| asd | 80 | 9543.170425415039 |
| zxc | 87 | 8027.090087890625 |
| dfg | 67 | 7303.070011138916 |
| wqer | 76 | 5234.969814300537 |

Since your query is fairly complex, you can use the find_by_sql method instead of converting it to Active Record query methods.
You can also use ActiveRecord::Base.connection directly to fetch the records, like this:
ActiveRecord::Base.connection.execute("your query")

You can build the subquery separately with ActiveRecord and convert it to SQL using to_sql.
Then use joins to join your table a with the subquery b. Note also the use of the Active Record clauses select, where, group and order, which are basically what you need to build this complex SQL query in ActiveRecord.
Something similar to the following should work:
subquery = SubModel.select("DISTINCT ... ").where(" ... ").to_sql
# alias the outer table as a so that the a.* references resolve
Model.from("market_place_idle_hosts_summaries a")
.select("b.owner, ... ")
.joins("JOIN (#{subquery}) b ON a.hostclass = b.hostclass AND a.week_number = b.week_number")
.where(" ... ")
.group("b.owner")
.order("Sum(a.idle_monthly_usage) DESC")
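To check that the derived-table JOIN returns the same shape as the original comma join, here is a minimal sketch that runs the rewritten query outside Rails. It is only an illustration under assumptions: it uses Python's sqlite3 instead of MySQL, the sample rows are made up, and strftime('%Y', ...) stands in for Year(). It also binds week and year as parameters instead of interpolating them into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema and rows (invented for illustration only).
conn.executescript("""
CREATE TABLE market_place_idle_hosts_summaries (
    hostclass TEXT, week_number INTEGER, updated_at TEXT,
    idle_total INTEGER, idle_monthly_usage REAL);
CREATE TABLE market_place_idle_hosts_details (
    owner TEXT, hostclass TEXT, week_number INTEGER, updated_at TEXT);
INSERT INTO market_place_idle_hosts_summaries VALUES
    ('web', 12, '2016-03-21', 10, 100.0),
    ('db',  12, '2016-03-21',  5, 250.0),
    ('web', 11, '2016-03-14', 99, 999.0);
INSERT INTO market_place_idle_hosts_details VALUES
    ('abc', 'web', 12, '2016-03-21'),
    ('abc', 'web', 12, '2016-03-22'),
    ('xyz', 'db',  12, '2016-03-21');
""")

# Same logic as the question's query, written as an explicit JOIN on a subquery.
QUERY = """
SELECT b.owner, SUM(a.idle_total), SUM(a.idle_monthly_usage)
FROM market_place_idle_hosts_summaries a
JOIN (SELECT DISTINCT owner, hostclass, week_number
      FROM market_place_idle_hosts_details
      WHERE week_number = :week
        AND CAST(strftime('%Y', updated_at) AS INTEGER) = :year) b
  ON a.hostclass = b.hostclass AND a.week_number = b.week_number
WHERE CAST(strftime('%Y', a.updated_at) AS INTEGER) = :year
GROUP BY b.owner
ORDER BY SUM(a.idle_monthly_usage) DESC
"""


def idle_totals_by_owner(week, year):
    """Return (owner, total idle, total monthly usage) rows, highest usage first."""
    return conn.execute(QUERY, {"week": week, "year": year}).fetchall()


for row in idle_totals_by_owner(12, 2016):
    print(row)
```

The DISTINCT in the subquery matters: without it, the two detail rows for owner abc would double-count the matching summary row.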

How to get column names from a query?

I have a specific query with joins and aliases, and I need to retrieve columns name for a REST request in Talend.
I'm using Talend Open Studio for Data Integration 6.2 with an Oracle 11g database and a read-only account. I can execute scripts with Talend. For example, the query:
select
u.name as "user",
f.name as "food",
e.rate
from
Users u
join Eval e on u.user_id = e.user_id
join Food f on e.food_id = f.food_id
where
1 = 1
should give the following result:
+------+--------+------+
| user | food | rate |
+------+--------+------+
| Baba | Donuts | 16.0 |
| Baba | Cheese | 20.0 |
| Keke | Pasta | 12.5 |
| Keke | Cheese | 15.0 |
+------+--------+------+
I am trying to get the column names (in the right order), as follows, using scripts or Talend:
+--------+
| Column |
+--------+
| user |
| food |
| rate |
+--------+
Is there a way to query the Oracle database for the column names, or to retrieve them with Talend?
UPDATE
Thanks to Marmite Bomber, a duplicate has been identified here for the Oracle approach. Now we need a Talend approach to the problem.
You can try this in a tJavaRow following your DBInput component:
for (java.lang.reflect.Field field : row1.getClass().getDeclaredFields()) {
    context.columnName = field.getName();
    System.out.println("Field name is " + context.columnName);
}
Spotted on the Talend help center: https://community.talend.com/t5/Design-and-Development/resolved-how-to-get-the-column-names-in-a-data-flow/td-p/99172
You can extend this and put the column list on your output flow:
// add this inside the loop, and add 'columnNames' as an output column in the tJavaRow schema
output_row.columnNames += context.columnName + ";";
With a tNormalize after the tJavaRow, you should get the expected result.
Here's a link to an Oracle community thread which should answer your question:
community.oracle.com
I am not able to write a comment, so posting this as an answer:
SELECT column_name
FROM all_tab_cols
WHERE table_name = 'table_name_here'
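If a scripting language is an option, the column names of an arbitrary query (aliases included, in select order) can usually be read from the cursor metadata without touching the data dictionary. A minimal sketch in Python, assuming a DB-API (PEP 249) driver; sqlite3 and the empty tables below are stand-ins for the Oracle connection, and column_names is a hypothetical helper:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Empty stand-in tables; only the query's shape matters here.
conn.executescript("""
CREATE TABLE Users (user_id INTEGER, name TEXT);
CREATE TABLE Food  (food_id INTEGER, name TEXT);
CREATE TABLE Eval  (user_id INTEGER, food_id INTEGER, rate REAL);
""")

QUERY = """
select u.name as "user", f.name as "food", e.rate
from Users u
join Eval e on u.user_id = e.user_id
join Food f on e.food_id = f.food_id
"""


def column_names(connection, query):
    """Run the query and return its result column names in select order."""
    cursor = connection.execute(query)
    # PEP 249: cursor.description holds one sequence per column, name first.
    return [column[0] for column in cursor.description]


print(column_names(conn, QUERY))
```

This works with a read-only account because it only executes the SELECT itself.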

Is it possible to do complex SQL queries using Django?

I have the following script to get a list of calculated indexes for each day after a specific date:
with test_reqs as (
select id_test, date_request, sum(n_requests) as n_req from cdr_test_stats
where
id_test in (2,4) and -- List of Ids included in index calc
date_request >= 20170823 -- Start date (end date -> Last in DB -> Today)
group by id_test, date_request
),
date_reqs as (
select date_request, sum(n_req) as n_req
from test_reqs
group by date_request
),
test_reqs_ratio as (
select H.id_test, H.date_request,
case when D.n_req = 0 then null else H.n_req/D.n_req end as ratio_req
from test_reqs H
inner join date_reqs D
on H.date_request = D.date_request
),
test_reqs_index as (
select HR.*, least(nullif(HA.n_dates_hbalert, 0), 10) as index_hb
from test_reqs_ratio HR
left join cdr_test_alerts_stats HA
on HR.id_test = HA.id_test and HR.date_request = HA.date_request
)
select date_request, 10-sum(ratio_req*index_hb) as index_hb
from test_reqs_index
group by date_request
Result:
---------------------------
| date_request | index_hb |
---------------------------
| 20170904 | 7.5508 |
| 20170905 | 7.6870 |
| 20170825 | 7.4335 |
| 20170901 | 7.7116 |
| 20170824 | 1.6568 |
| 20170823 | 0.0000 |
| 20170903 | 5.1850 |
| 20170830 | 0.0000 |
| 20170828 | 0.0000 |
---------------------------
The problem is that I want to get the same result in Django without executing the raw query through a cursor.
Many thanks for any suggestion.
Without going deep into the specifics of your query, I'd say the Django ORM is expressive enough to handle most problems, but it would generally require you to redesign the query from the ground up. You would have to use subqueries and joins instead of the CTEs, and you might end up with a solution that does some of the work in Python land instead of in the DB.
Taking this into account, the answer is: it depends. Your requirements, such as performance and data size, play a role.
Another solution worth considering is declaring your SQL query as a view, and at least in the case of Postgres, use something like django-pgviews to query it with Django ORM almost as if it were a model.
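To illustrate the "some of the work in Python land" option, here is a rough sketch that reproduces the CTE chain in plain Python. It assumes the rows have already been fetched with two simple ORM queries (for example as tuples); daily_index and its argument shapes are hypothetical, and it uses true division, whereas the SQL may perform integer division depending on the column types.

```python
from collections import defaultdict


def daily_index(stats, alerts, test_ids, start_date, cap=10):
    """Mirror the question's CTEs in Python.

    stats:  iterable of (id_test, date_request, n_requests) rows
    alerts: dict mapping (id_test, date_request) -> n_dates_hbalert
    Returns {date_request: index_hb}, with None where SQL would yield NULL.
    """
    # test_reqs: requests per (test, date), filtered like the WHERE clause.
    per_test = defaultdict(int)
    for id_test, date_request, n_requests in stats:
        if id_test in test_ids and date_request >= start_date:
            per_test[(id_test, date_request)] += n_requests

    # date_reqs: requests per date.
    per_date = defaultdict(int)
    for (_, date_request), n_req in per_test.items():
        per_date[date_request] += n_req

    # test_reqs_ratio + test_reqs_index: ratio * least(nullif(alerts, 0), cap).
    terms = defaultdict(list)
    for (id_test, date_request), n_req in per_test.items():
        terms[date_request]  # make sure every date appears in the result
        total = per_date[date_request]
        n_alert = alerts.get((id_test, date_request))
        if total == 0 or not n_alert:
            continue  # NULL ratio or NULL index: SUM() skips the row
        terms[date_request].append(n_req / total * min(n_alert, cap))

    # Final select: 10 - sum(...), NULL when every term was NULL.
    return {d: (10 - sum(t) if t else None) for d, t in terms.items()}


stats = [(2, 20170823, 30), (4, 20170823, 10), (2, 20170824, 5)]
alerts = {(2, 20170823): 4, (4, 20170823): 20, (2, 20170824): 0}
print(daily_index(stats, alerts, {2, 4}, 20170823))
```

Whether this is acceptable depends on how many rows have to be pulled into Python; for large tables the view approach keeps the aggregation in the database.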

SQL join two tables using value from one as column name for other

I'm a bit stumped on a query I need to write for work. I have the following two tables:
|===============Patterns==============|
|type | bucket_id | description |
|-----------------------|-------------|
|pattern a | 1 | Email |
|pattern b | 2 | Phone |
|==========Results============|
|id | buc_1 | buc_2 |
|-----------------------------|
|123 | pass | |
|124 | pass |fail |
In the results table, I can see that entity 124 failed a validation check in buc_2. Looking at the patterns table, I can see bucket 2 belongs to pattern b (bucket_id corresponds to the column name in the results table), so entity 124 failed phone validation. But how do I write a query that joins these two tables on the value of one of the columns? Limitations to how this query is going to be called will most likely prevent me from using any cursors.
Some crude solutions:
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_1" = 'fail' AND "bucket_id" = 1
union all
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_2" = 'fail' AND "bucket_id" = 2
Or, quite probably with a better execution plan:
SELECT "id", "description" FROM
Results JOIN Patterns
ON ("buc_1" = 'fail' AND "bucket_id" = 1)
OR ("buc_2" = 'fail' AND "bucket_id" = 2);
This will report all failure descriptions for each id having a fail case in bucket 1 or 2.
See http://sqlfiddle.com/#!4/a3eae/8 for a live example
That being said, the right solution would probably be to change your schema to something more manageable, say by using an association table to store each failed test, since what you have here is in fact a many-to-many relationship.
Another approach, if you are using Oracle ≥ 11g, would be to use the UNPIVOT operation. This translates columns to rows at query execution:
select * from Results
unpivot ("result" for "bucket_id" in ("buc_1" as 1, "buc_2" as 2))
join Patterns
using("bucket_id")
where "result" = 'fail';
Unfortunately, you still have to hard-code the various column names.
See http://sqlfiddle.com/#!4/a3eae/17
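On databases without UNPIVOT, the same column-to-row reshaping can be emulated with a UNION ALL that is generated from a mapping of bucket ids to column names. The sketch below is one way to do it, assuming Python with sqlite3 as a stand-in database; BUCKET_COLUMNS and failed_checks are hypothetical names, and the mapping still has to be maintained by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The two tables from the question.
conn.executescript("""
CREATE TABLE Patterns (type TEXT, bucket_id INTEGER, description TEXT);
CREATE TABLE Results  (id INTEGER, buc_1 TEXT, buc_2 TEXT);
INSERT INTO Patterns VALUES ('pattern a', 1, 'Email'), ('pattern b', 2, 'Phone');
INSERT INTO Results  VALUES (123, 'pass', NULL), (124, 'pass', 'fail');
""")

# Hard-coded, trusted mapping of bucket_id -> column name in Results.
BUCKET_COLUMNS = {1: "buc_1", 2: "buc_2"}


def failed_checks(connection):
    """Return (id, description) for every failed bucket."""
    # One SELECT per bucket column turns the columns into rows.
    unpivoted = " UNION ALL ".join(
        f"SELECT id, {bucket_id} AS bucket_id, {column} AS result FROM Results"
        for bucket_id, column in BUCKET_COLUMNS.items()
    )
    query = f"""
        SELECT r.id, p.description
        FROM ({unpivoted}) r
        JOIN Patterns p ON p.bucket_id = r.bucket_id
        WHERE r.result = 'fail'
        ORDER BY r.id, p.bucket_id
    """
    return connection.execute(query).fetchall()


print(failed_checks(conn))
```

Because the column names are interpolated into the SQL text, they must come from a fixed list like the one above, never from user input.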
It looks to me like what you really want to know is the description (Phone, in your example) of a Patterns entry, given that its bucket failed. You want a solution that fulfills that condition in general, not just for your particular example.
I agree with the comment above. Your bucket entries should be tuples (rows) and not attributes (columns), and the tables should share ids so you can actually join them. For example, consider adding a bucket_id column to Results and just ONE status column to store the state, like this:
|===============Patterns==============|
|type | bucket_id | description |
|-----------------------|-------------|
|pattern a | 1 | Email |
|pattern b | 2 | Phone |
|==========Results====================|
|entity_id | bucket_id |status |
|-------------------------------------|
|123 | 1 |pass |
|124 | 1 |pass |
|123 | 2 | |
|124 | 2 |fail |
1.-Use an Inner Join: http://www.w3schools.com/sql/sql_join_inner.asp and the WHERE clause to filter only those buckets that failed.
2.-Would this example help?
SELECT Patterns.type, Patterns.description, Results.entity_id, Results.status
FROM Patterns
INNER JOIN Results
ON
Patterns.bucket_id = Results.bucket_id
WHERE
Results.status = 'fail'
Lastly, I would also add a primary_key column to each table to make sure indexing is faster for each unique combination.
Thanks!
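For anyone who wants to try the normalized layout, here is a small sketch that creates the two tables as proposed in this answer and runs the inner join. It assumes Python with sqlite3; the key definitions are my own guess at what the answer's "primary_key column" remark could look like, not something the answer spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Normalized schema: one row per (entity, bucket) instead of one column per bucket.
conn.executescript("""
CREATE TABLE Patterns (
    type TEXT,
    bucket_id INTEGER PRIMARY KEY,
    description TEXT);
CREATE TABLE Results (
    entity_id INTEGER,
    bucket_id INTEGER REFERENCES Patterns(bucket_id),
    status TEXT,
    PRIMARY KEY (entity_id, bucket_id));
INSERT INTO Patterns VALUES ('pattern a', 1, 'Email'), ('pattern b', 2, 'Phone');
INSERT INTO Results VALUES
    (123, 1, 'pass'), (124, 1, 'pass'), (123, 2, NULL), (124, 2, 'fail');
""")


def failures(connection):
    """Return (type, description, entity_id, status) for every failed check."""
    return connection.execute("""
        SELECT p.type, p.description, r.entity_id, r.status
        FROM Patterns p
        INNER JOIN Results r ON p.bucket_id = r.bucket_id
        WHERE r.status = 'fail'
        ORDER BY r.entity_id, r.bucket_id
    """).fetchall()


print(failures(conn))
```

Adding a third bucket now means inserting rows, not altering the table or the query.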

SQL Query: Search with list of tuples

I have the following table (simplified version) in SQL Server.
Table Events
-----------------------------------------------------------
| Room | User | Entered | Exited |
-----------------------------------------------------------
| A | Jim | 2014-10-10T09:00:00 | 2014-10-10T09:10:00 |
| B | Jim | 2014-10-10T09:11:00 | 2014-10-10T09:22:30 |
| A | Jill | 2014-10-10T09:00:00 | NULL |
| C | Jack | 2014-10-10T09:45:00 | 2014-10-10T10:00:00 |
| A | Jack | 2014-10-10T10:01:00 | NULL |
.
.
.
I need to create a query that returns a person's whereabouts at given timestamps.
For example: Where was (Jim at 2014-10-09T09:05:00), (Jim at 2014-10-10T09:01:00), (Jill at 2014-10-10T09:10:00), ...
The result set must contain the given User and Timestamp as well as the found room (if any).
------------------------------------------
| User | Timestamp | WasInRoom |
------------------------------------------
| Jim | 2014-10-09T09:05:00 | NULL |
| Jim | 2014-10-10T09:01:00 | A |
| Jim | 2014-10-10T09:10:00 | A |
The number of User-Timestamp tuples can be > 10 000.
The current implementation retrieves all records from Events table and does the search in Java code. I am hoping that I could push this logic to SQL. But how?
I am using MyBatis framework to create SQL queries so the tuples can be inlined to the query.
The basic query is:
select e.*
from events e
where (e.user = 'Jim' and '2014-10-09T09:05:00' >= e.entered and ('2014-10-09T09:05:00' <= e.exited or e.exited is NULL)) or
(e.user = 'Jill' and '2014-10-10T09:10:00' >= e.entered and ('2014-10-10T09:10:00' <= e.exited or e.exited is NULL)) or
. . .;
SQL Server can handle ridiculously large queries, so you can continue in this vein. However, if you have the name/time values in a table already (or it is the result of a query), then use a join:
select ut.*, e.*
from usertimes ut left join
events e
on e.user = ut.user and
ut.thetime >= e.entered and (ut.thetime <= e.exited or e.exited is null);
Note the use of a left join here. It ensures that all the original rows are in the result set, even when there are no matches.
Answers from Jonas and Gordon got me on track, I think.
Here is a query that seems to do the job:
CREATE TABLE #SEARCH_PARAMETERS("User" VARCHAR(16), "Timestamp" DATETIME)
INSERT INTO #SEARCH_PARAMETERS("User", "Timestamp")
VALUES
('Jim', '2014-10-09T09:05:00'),
('Jim', '2014-10-10T09:01:00'),
('Jill', '2014-10-10T09:10:00')
SELECT #SEARCH_PARAMETERS.*, Events.Room FROM #SEARCH_PARAMETERS
LEFT JOIN Events
ON #SEARCH_PARAMETERS."User" = Events."User" AND
#SEARCH_PARAMETERS."Timestamp" > Events.Entered AND
(Events.Exited IS NULL OR Events.Exited > #SEARCH_PARAMETERS."Timestamp")
DROP TABLE #SEARCH_PARAMETERS
By declaring a table valued parameter type for the (user, timestamp) tuples, it should be simple to write a table valued user defined function which returns the desired result by joining the parameter table and the Events table. See http://msdn.microsoft.com/en-us/library/bb510489.aspx
Since you are using MyBatis it may be easier to just generate a table variable for the tuples inline in the query and join with that.
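The temp-table-plus-LEFT-JOIN idea from the answers above can be tried end to end with a small script. This is only a sketch under assumptions: it uses Python with sqlite3 rather than SQL Server and MyBatis, the user column is renamed user_name to sidestep the reserved word, timestamps are stored as ISO-8601 text (which compares correctly as strings), and whereabouts is a hypothetical helper. It uses the inclusive bounds from the question's basic query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The Events rows from the question.
conn.executescript("""
CREATE TABLE events (room TEXT, user_name TEXT, entered TEXT, exited TEXT);
INSERT INTO events VALUES
    ('A', 'Jim',  '2014-10-10T09:00:00', '2014-10-10T09:10:00'),
    ('B', 'Jim',  '2014-10-10T09:11:00', '2014-10-10T09:22:30'),
    ('A', 'Jill', '2014-10-10T09:00:00', NULL),
    ('C', 'Jack', '2014-10-10T09:45:00', '2014-10-10T10:00:00'),
    ('A', 'Jack', '2014-10-10T10:01:00', NULL);
""")


def whereabouts(connection, lookups):
    """Return (user, timestamp, room or None) for each (user, timestamp) pair."""
    connection.execute(
        "CREATE TEMP TABLE search_parameters (user_name TEXT, ts TEXT)")
    try:
        # Bulk-load the tuples; this scales to tens of thousands of pairs.
        connection.executemany(
            "INSERT INTO search_parameters VALUES (?, ?)", lookups)
        # LEFT JOIN keeps pairs with no matching room (room comes back NULL).
        return connection.execute("""
            SELECT p.user_name, p.ts, e.room
            FROM search_parameters p
            LEFT JOIN events e
              ON e.user_name = p.user_name
             AND p.ts >= e.entered
             AND (e.exited IS NULL OR p.ts <= e.exited)
            ORDER BY p.rowid
        """).fetchall()
    finally:
        connection.execute("DROP TABLE search_parameters")


lookups = [
    ("Jim", "2014-10-09T09:05:00"),
    ("Jim", "2014-10-10T09:01:00"),
    ("Jill", "2014-10-10T09:10:00"),
]
for row in whereabouts(conn, lookups):
    print(row)
```

If two events overlap for the same user, a lookup can return more than one row; the sample data has no such overlaps.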

How to get numbers arranged right to left in sql server SELECT statements

When performing SELECT statements that include number columns (prices, for example), the result is always left-aligned, which reduces readability. Therefore I'm looking for a way to right-align the output of number columns.
I already tried to use something like
SELECT ... SPACE(15-LEN(A.Nummer))+A.Nummer ...
FROM Artikel AS A ...
which gives close results, but depending on the font, not quite. An alternative would be to replace 'SPACE()' with 'REPLICATE('_',...)', but I don't really like the underscores in the output.
Besides that, this formula will fail on numbers with more than 15 digits, so I searched for a way of finding the maximum length of the entries to make it safer, like
SELECT ... SPACE(MAX(LEN(A.Nummer))-LEN(A.Nummer))+A.Nummer ...
FROM Artikel AS A ...
but this does not work due to the aggregate character of the MAX-function.
So, what's the best way to right-justify the number columns?
Thanks,
Rainer
To get your problem with the list box solved, have a look at this link: http://www.lebans.com/List_Combo.htm
I strongly believe that this type of adjustment should be made in the UI layer and not mixed in with data retrieval.
But to answer your original question, I have created a SQL Fiddle:
MS SQL Server 2008 Schema Setup:
CREATE TABLE dbo.some_numbers(n INT);
Create some example data:
INSERT INTO dbo.some_numbers
SELECT CHECKSUM(NEWID())
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))X(x);
The following query uses the OVER() clause to specify that the MAX() is to be applied over all rows. The > and < that the result is wrapped in are just for illustration purposes and not required for the solution.
Query 1:
SELECT '>'+
SPACE(MAX(LEN(CAST(n AS VARCHAR(MAX))))OVER()-LEN(CAST(n AS VARCHAR(MAX))))+
CAST(n AS VARCHAR(MAX))+
'<'
FROM dbo.some_numbers SN;
Results:
| COLUMN_0 |
|---------------|
| >-1486993739< |
| > 1620287540< |
| >-1451542215< |
| >-1257364471< |
| > -819471559< |
| >-1364318127< |
| >-1190313739< |
| > 1682890896< |
| >-1050938840< |
| > 484064148< |
This query does a straight cast to show the difference:
Query 2:
SELECT '>'+CAST(n AS VARCHAR(MAX))+'<'
FROM dbo.some_numbers SN;
Results:
| COLUMN_0 |
|---------------|
| >-1486993739< |
| >1620287540< |
| >-1451542215< |
| >-1257364471< |
| >-819471559< |
| >-1364318127< |
| >-1190313739< |
| >1682890896< |
| >-1050938840< |
| >484064148< |
With this query you still need to change the display font to a monospaced font like COURIER NEW. Otherwise, as you have noticed, the result is still misaligned.
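As for doing the alignment in the UI layer instead, the padding logic is only a few lines in most client languages. A minimal sketch in Python, assuming the numbers arrive unformatted from the query; right_align is a hypothetical helper, and the result still needs a monospaced font to line up.

```python
def right_align(numbers):
    """Pad each number on the left so that all strings share the widest length."""
    texts = [str(n) for n in numbers]
    # Same idea as MAX(LEN(...)) OVER(): the widest entry sets the column width.
    width = max((len(t) for t in texts), default=0)
    return [t.rjust(width) for t in texts]


for line in right_align([5, -1234, 42]):
    print(">" + line + "<")
```

Because the width is computed from the data, there is no fixed limit such as the 15 characters in the question's first attempt.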