I'm looking for an intelligent way to convert this SQL statement into a @NamedQuery, if there is a way at all.
SELECT MONTH(dateField), sum(value) FROM mydb.records where status ='paid' group by MONTH(dateField) order by MONTH(dateField);
I have a JPA @Entity called Record (Hibernate). It holds all invoices in the system, which are created on a daily basis, so there will be many entries per month. Each record has a status (paid, overdue), a value, and lots of other information such as the customer's name and address.
The above statement summarises the data month by month, summing the value of all paid invoices per month: all invoices paid in January, all paid in February, and so on. The result looks something like:
datefield  value
1          4500
2          5500
3          5669
The only way I can think of doing this with a JPA @NamedQuery is to select all records in the table with status 'paid' and then do the sorting, ordering and addition in my Java code, in a rather slow and ugly fashion! Is there a clever way I can do this with @NamedQuery?
Thanks
MONTH() is not a standard JPQL function defined in the JPA spec, however there is an alternative.
Make a view in your database that leverages your database's month function.
create view MONTHLY_REVENUE as
SELECT MONTH(dateField) AS revenue_month, -- aliases give the entity stable column names to map
sum(value) AS total_value
FROM mydb.records
where status ='paid'
group by MONTH(dateField)
order by MONTH(dateField);
Then just create an entity for this view as you would for any other table with JPA. You can select from the view but will not be able to save or update through it; however, since you're using aggregates, it's not like you would anyway.
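To make the view approach concrete, here is a minimal sketch that runs against an in-memory SQLite database instead of the asker's database. Everything in it is an assumption for illustration: the sample rows, the view name `monthly_revenue`, the column aliases, and the use of `strftime('%m', ...)`, which stands in for `MONTH()` because SQLite has no such function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (dateField TEXT, value INTEGER, status TEXT);
    INSERT INTO records VALUES
        ('2023-01-05', 2000, 'paid'),
        ('2023-01-20', 2500, 'paid'),
        ('2023-02-11', 5500, 'paid'),
        ('2023-02-15',  900, 'overdue');   -- not paid, so excluded by the view

    -- Hypothetical view: SQLite has no MONTH(), so strftime('%m', ...) is used.
    CREATE VIEW monthly_revenue AS
        SELECT CAST(strftime('%m', dateField) AS INTEGER) AS revenue_month,
               SUM(value) AS total_value
        FROM records
        WHERE status = 'paid'
        GROUP BY revenue_month;
""")

# Reading from the view is an ordinary SELECT, which is what a mapped
# read-only entity would issue as well.
monthly_revenue = conn.execute(
    "SELECT revenue_month, total_value FROM monthly_revenue ORDER BY revenue_month"
).fetchall()
print(monthly_revenue)
```

The grouping and summing happen entirely inside the database; the client only receives one row per month.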
I have a Snowflake table with the following fields:
Date
Transaction Type
Transaction Speed
Company
The table has millions of rows, so I want to summarize the data which will then feed into Power BI. I want to group by Date, then Transaction Type, then Company, and sum the values in Transaction Speed.
I'm very new to SQL and have created some basic views, but am having trouble creating the summarization. Can anyone give me some guidance?
It is usually helpful to provide an example of what you have tried, but assuming I understand your requirements, you're likely looking for something like this:
SELECT date, transaction_type, company, sum(transaction_speed) as total_transaction_speed
FROM table
GROUP BY date, transaction_type, company;
I created a table named user_preferences where user preferences have been grouped by user_id and month.
Table:
Each month I collect all user_ids and assign all preferences:
city
district
number of rooms
the maximum price they can spend
The plan assumes displaying a graph showing users' shopping intentions like this:
The blue line is the number of interested users for the selected values in the filters.
The graph should enable filtering by parameters marked in red.
What you see above is a simplified form to clarify the subject. In fact, there are many more users, and every month the table grows by several hundred thousand records. The SQL query that feeds the chart takes up to 50 seconds. That's far too long; I can't afford it.
So, I need to create a table (table/aggregation/data mart) into which I can insert the precalculated number of interested users for all combinations. That way, the end user will not have to wait for the data to be counted.
Details below:
Now the question is - how to create such a table in PostgreSQL?
I know how to write a SQL query that will calculate a specific example.
SELECT
month,
count(DISTINCT user_id) interested_users
FROM
user_preferences
WHERE
month BETWEEN '2020-01' AND '2020-03'
AND city = 'Madrid'
AND district = 'Latina'
AND rooms IN (1,2)
AND price_max BETWEEN 400001 AND 500000
GROUP BY
1
The question is - how do I calculate all possible combinations? Can I write multiple nested loops in SQL?
The topic is extremely important to me, I think it will also be useful to others for the future.
I will be extremely grateful for any tips.
Well, based on your query, you have the following filters:
month
city
district
rooms
price_max
You can try creating a view with the following structure:
SELECT month
,city
,district
,rooms
,price_max
,count(DISTINCT user_id) AS interested_users
FROM user_preferences
GROUP BY month
,city
,district
,rooms
,price_max
You can make this view materialized, so the query behind the view is not executed each time the view is queried. It will behave like a table.
When you add new records to the base table, you will need to refresh the view (unfortunately, PostgreSQL does not support auto-refresh like some other databases):
REFRESH MATERIALIZED VIEW my_view;
or you can schedule a task to do it.
If you are using only exact search for each field, this will work. But in your example, you have criteria like:
month BETWEEN '2020-01' AND '2020-03'
AND rooms IN (1,2)
AND price_max BETWEEN 400001 AND 500000
In such cases, I usually write the same query but SUM the data from the materialized view. In your case, however, you are using DISTINCT, so a user who appears in more than one of the summed groups may be counted multiple times.
If this is an issue, you would need to precalculate too many combinations, and I doubt that is the answer. Alternatively, you can try to normalize your data - this will improve the performance of the aggregations.
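The double-counting risk can be shown with a small sketch. It uses an in-memory SQLite database, with a plain table standing in for the materialized view, and it assumes (this is not stated in the question) that one user can have several rows in the same month, for example one per room count. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_preferences (user_id INTEGER, month TEXT, city TEXT, rooms INTEGER);
    INSERT INTO user_preferences VALUES
        (1, '2020-01', 'Madrid', 1),
        (1, '2020-01', 'Madrid', 2),   -- same user, second room preference
        (2, '2020-01', 'Madrid', 2);

    -- Plain table standing in for the materialized view.
    CREATE TABLE my_view AS
        SELECT month, city, rooms, COUNT(DISTINCT user_id) AS interested_users
        FROM user_preferences
        GROUP BY month, city, rooms;
""")

# Summing the pre-aggregated distinct counts across two room groups...
summed = conn.execute(
    "SELECT SUM(interested_users) FROM my_view WHERE rooms IN (1, 2)"
).fetchone()[0]

# ...versus counting distinct users directly on the base table.
exact = conn.execute(
    "SELECT COUNT(DISTINCT user_id) FROM user_preferences WHERE rooms IN (1, 2)"
).fetchone()[0]

print(summed, exact)
```

Here user 1 falls into both the 1-room and the 2-room group, so the summed figure is larger than the true number of distinct users. Summing pre-aggregated counts is only safe when each user lands in exactly one of the groups being added together.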
I need to calculate values for a record in a database based on values in other records. Using SQL Server 2012, what would be the best way to do this? I'm thinking of some type of script that runs on the server, queries for the values it needs, computes the result, and inserts it into the appropriate record. I know you can have computed columns based on other columns in SQL Server, but what about new records based on different columns in different records?
Thanks!
EDIT:
I'm using a google charts table on an MVC4 Razor website to show items purchased by specific users by month and year; looks something like this:
Email Address | Purchase Value | Year | Month
This currently works absolutely fine. I query the database for purchases by user and group by month and year and sum the purchases, and I put the values in the table. I also have category filters that only show one month and one year, so only one user is shown at a time.
Now management wants an 'All' selection on the category filter, which means that for every month of every year, and every year total, I'm going to have to compute a cumulative purchase value for each user and put it in the table; you can imagine, if the users list gets very long, this could take some time. So, I think the best option would probably be to have a script that groups purchases by year and by user and updates a new record every time a donation is made anytime within that year; obviously, you'd do the same for each month of the year. That way, I wouldn't have to worry about computing this when the user requests the page. I'm just not sure how to go about writing a script for SQLServer that would be able to do something like this.
This shows how to calculate values for a record in a database based on values in other records. The example is written in T-SQL and can be executed on SQL Server. You will need to change the script to use your own tables and columns.
DECLARE @total dec(12,2), @num int --Variable declaration
SET @total = (SELECT SUM(Salary) FROM Employee) --Capture sum of employee salaries
SET @num = (SELECT COUNT(ID) FROM Employee) --Capture the number of employees
SELECT @total 'Total', --calculate values for a record in a database based off of other values in other records
@num 'Number of employees',
@total/@num 'Average'
INTO
dbo.AverageSalary
Hope this helps.
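For the per-user totals by year and month described in the question, one possible shape is a summary table that a scheduled job rebuilds from the purchases. The sketch below is only an illustration of that idea: it runs on an in-memory SQLite database rather than SQL Server, and the table names (`purchases`, `purchase_summary`), the columns, the sample rows, and the `rebuild_summary` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (email TEXT, purchase_date TEXT, amount REAL);
    INSERT INTO purchases VALUES
        ('a@example.com', '2013-01-10', 10.0),
        ('a@example.com', '2013-01-25', 15.0),
        ('a@example.com', '2013-02-03', 20.0),
        ('b@example.com', '2013-01-07',  5.0);
    CREATE TABLE purchase_summary (email TEXT, year INTEGER, month INTEGER, total REAL);
""")


def rebuild_summary(conn):
    """Recompute every (email, year, month) total in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM purchase_summary")
        conn.execute("""
            INSERT INTO purchase_summary (email, year, month, total)
            SELECT email,
                   CAST(strftime('%Y', purchase_date) AS INTEGER),
                   CAST(strftime('%m', purchase_date) AS INTEGER),
                   SUM(amount)
            FROM purchases
            GROUP BY 1, 2, 3
        """)


rebuild_summary(conn)
summary = conn.execute(
    "SELECT email, year, month, total FROM purchase_summary "
    "ORDER BY email, year, month"
).fetchall()
print(summary)
```

The page can then read `purchase_summary` directly instead of aggregating on every request. A full rebuild is the simplest variant; updating only the affected row when a purchase arrives is another option, at the cost of more bookkeeping.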
I have been struggling to create a query in Access that selects a distinct field, with the criterion that it be the newest entry in the database.
Here's a brief summary of what my table consists of. I have a table with surveying data collected from 2007 to the present. There is a field with the survey mark's name, along with the corresponding adjustment data, which includes a field with the adjustment date. Many of the marks have been occupied multiple times, and I only want to retrieve the most recent occupation information.
Roughly, I want to
SELECT DISTINCT STATUS_POINT_DESIGNATION
FROM __ALL_ADJUSTMENTS
WHERE [__ALL_ADJUSMENTS]![ADJ_DATE]=MAX(ADJ_DATE)
I seem to be getting confused about how to combine selecting a distinct value with a constraint. Any suggestions?
DH
Seems you could achieve your aim of getting the latest observation for each survey point by a summary function:
SELECT STATUS_POINT_DESIGNATION, Max(ADJ_DATE) AS LatestDate, Count(STATUS_POINT_DESIGNATION) AS Observations
FROM __ALL_ADJUSTMENTS
GROUP BY STATUS_POINT_DESIGNATION;
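If the other columns of the most recent row are needed as well, a common pattern is to join the grouped result back to the table on both the point and its latest date. The sketch below demonstrates this on an in-memory SQLite database; the `adjustments` table, its columns, and the sample rows are invented for illustration, and the same join may need small syntax changes in Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE adjustments (point TEXT, adj_date TEXT, elevation REAL);
    INSERT INTO adjustments VALUES
        ('BM1', '2007-05-01', 100.10),
        ('BM1', '2012-08-15', 100.12),   -- most recent occupation of BM1
        ('BM2', '2009-03-20', 250.40);
""")

# The inner query finds the latest date per point; the join then picks
# the full row that carries that date.
latest_rows = conn.execute("""
    SELECT a.point, a.adj_date, a.elevation
    FROM adjustments AS a
    JOIN (SELECT point, MAX(adj_date) AS latest_date
          FROM adjustments
          GROUP BY point) AS m
      ON a.point = m.point AND a.adj_date = m.latest_date
    ORDER BY a.point
""").fetchall()
print(latest_rows)
```

If two rows for the same point share the latest date, both are returned, so a tie-breaking column would be needed to guarantee exactly one row per point.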
I need to write a SQL query that pulls from a table that stores a record for each time one of our salespeople speaks to a client. The relevant columns are: (1) the salesperson's employee ID, (2) the client's account number, and (3) the date of the conversation.
It's often the case that salespeople have spoken to clients multiple times within the report period (a calendar month) so there will be several entries that are nearly identical except for the date.
Where I'm getting tripped up is that, for the purpose of this query, I need to return only one record per salesperson/client combination, but I can't use DISTINCT because I need to include the date of the most recent conversation within the reporting period.
So, if salesperson John has spoken to client ABC on 10/10, 10/18, and 10/25 I need to pull the 10/25 record but not the others.
It's a Sybase database.
I have the feeling that I may be missing something simple here but I've tried searching and remain stumped. Any help is greatly appreciated.
Thanks for your time,
John
Guessing at the column names...
SELECT employee_id, client_acct_no,
MAX(conversation_date) AS MOST_RECENT_CONV_DATE
FROM mytable
WHERE conversation_date BETWEEN DATE '2010-10-01' AND DATE '2010-10-31'
GROUP BY employee_id, client_acct_no
Documentation for GROUP BY clause.