Explain the two conversions used between Hive date functions

I am trying to count the number of records for a particular date.
I eventually got the query working, but I am confused by these two queries, which look the same to me. Why should I pass date_time unquoted instead of in quotes in the conversion?
When I run this query,
select count(*) from TABLENAME
where FROM_UNIXTIME(UNIX_TIMESTAMP(date_time), 'yyyyMMdd')='20170312';
the count for that particular date is returned.
But when I run,
select count(*) from TABLENAME
where FROM_UNIXTIME(UNIX_TIMESTAMP('date_time', 'yyyyMMdd'))='20170312';
the result is 0.
Please explain the difference between these queries.

date_time is a column, while 'date_time' is a string literal, and the attempt to parse that literal as a date results in NULL.
If you want to quote the column name, use backticks: `date_time`
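The column-versus-literal distinction can be reproduced outside Hive. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption, since Hive itself is not runnable here); the events table and its rows are hypothetical, and SQLite's strftime plays the role of the Hive conversion functions.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date_time TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("2017-03-12 08:15:00",), ("2017-03-12 21:40:00",), ("2017-03-13 09:00:00",)],
)

# Unquoted date_time is the column: each row's value gets formatted.
by_column = conn.execute(
    "SELECT COUNT(*) FROM events "
    "WHERE strftime('%Y%m%d', date_time) = '20170312'"
).fetchone()[0]

# Quoted 'date_time' is a string literal: it is not a valid date, so the
# conversion yields NULL and the comparison never matches any row.
by_literal = conn.execute(
    "SELECT COUNT(*) FROM events "
    "WHERE strftime('%Y%m%d', 'date_time') = '20170312'"
).fetchone()[0]

# The converted literal on its own, to show the NULL directly.
literal_value = conn.execute(
    "SELECT strftime('%Y%m%d', 'date_time')"
).fetchone()[0]

print(by_column, by_literal, literal_value)
```

The second count is 0 for the same reason as in the question: the literal never depends on the row, so every row compares NULL against '20170312'.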

Related

SQL Return row from MAX() function

My first foray into SQL and I'm having some difficulty applying the MAX() function.
If I run the following, I receive the correct returned value:
SELECT MAX(count)
FROM readings
However, when I try to also return fields related to that value, I get an incorrect result. Running the following returns the correct 'count' value but an incorrect 'location' value:
SELECT MAX(count), location
FROM readings
What I expected from the above are the same results as from:
SELECT count, location
FROM readings
ORDER BY count DESC
LIMIT 1
Could you please advise if it is possible to achieve this using the MAX() function or if I have just misunderstood what MAX actually does!
Your advice is greatly appreciated.
What database system are you using? MAX is an aggregation function that should be operating across the entire table, while selecting a single value (like location in your query) is operating only on a single row. In most databases, if you want to select another column, you must specify that column in a GROUP BY clause or also wrap it in a similar aggregation function.
To get the value of location in the same row, you typically should use a subselect, like this:
SELECT count, location
FROM readings
WHERE count = (SELECT MAX(count) FROM readings);
Note that this doesn't guarantee a single result, though; there could be several rows that match the maximum count value!
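The tie caveat is easy to see with a small data set. This is a sketch only, run against SQLite through Python's sqlite3 module (an assumption, since the question never names its database); the sample rows are invented, and the count column is double-quoted simply to keep it clearly distinct from the COUNT function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same table and column names as the question; the rows are made up.
conn.execute('CREATE TABLE readings ("count" INTEGER, location TEXT)')
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(3, "north"), (9, "south"), (9, "east"), (5, "west")],
)

# Subselect form from the answer: every row whose count equals the maximum.
rows_at_max = conn.execute(
    'SELECT "count", location FROM readings '
    'WHERE "count" = (SELECT MAX("count") FROM readings) '
    "ORDER BY location"
).fetchall()

print(rows_at_max)
```

Two locations share the maximum here, so two rows come back, whereas the ORDER BY ... LIMIT 1 form from the question would return just one of them.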

SQL query does not return correct results

I am trying to filter between two dates on a SQL server from a PHP process.
This is the query:
select *
from webStocks
where FECHAMODIFICADO between '2020-06-03 17:16:02' and '2020-06-04 17:16:03'
ORDER BY webStocks.FECHAMODIFICADO DESC
This is the result:
The result is not what I want. In the table I have the following information and it should be the result.
What am I doing wrong in the query? :(
I'd first make sure the date column actually holds a date/time data type.
If it doesn't, casting it in the query should fix it:
SELECT *
FROM webStocks
WHERE CAST(FECHAMODIFICADO AS datetime) BETWEEN '2020-06-03 17:16:02' AND '2020-06-04 17:16:03'
ORDER BY webStocks.FECHAMODIFICADO DESC
You can see more information about this kind of statement here.
(This was written with MySQL in mind. On SQL Server, timestamp is a synonym for rowversion rather than a date/time type, so the cast above targets datetime. Other SQL servers will probably work with either a CAST or a CONVERT statement.)
SQL tables represent unordered sets. That means that when you run a query with no ORDER BY, the results can be in any order -- and even in different orders on different runs.
In this case, you have an ORDER BY. But the column has duplicates. The same principle applies: rows with the same key value can be in any order -- and even in different orders on different runs.
So, you need to add a new key to capture the order that you want. It is not obvious from your data. But the results would at least be stable if you used:
ORDER BY webStocks.FECHAMODIFICADO DESC, CodeArticulo
It is also odd that your WHERE clause includes very specific times. But the data in these rows is all occurring at midnight. Usually midnight is not such an active time, if the time stamps represent human behavior.
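The effect of the tie-breaking key can be sketched as follows. The example uses SQLite through Python's sqlite3 module rather than SQL Server (an assumption made so it runs anywhere), stores the dates as ISO-formatted text, and uses invented rows; only the table and column names come from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE webStocks (CodeArticulo TEXT, FECHAMODIFICADO TEXT)")
conn.executemany(
    "INSERT INTO webStocks VALUES (?, ?)",
    [
        ("B2", "2020-06-04 00:00:00"),  # same timestamp as A1: a tie
        ("A1", "2020-06-04 00:00:00"),
        ("C3", "2020-06-03 18:00:00"),
        ("D4", "2020-06-05 00:00:00"),  # outside the BETWEEN range
    ],
)

# The second ORDER BY key decides the order of rows that share a timestamp,
# so the result no longer depends on how the engine happens to read them.
ordered = conn.execute(
    "SELECT CodeArticulo, FECHAMODIFICADO FROM webStocks "
    "WHERE FECHAMODIFICADO BETWEEN '2020-06-03 17:16:02' AND '2020-06-04 17:16:03' "
    "ORDER BY FECHAMODIFICADO DESC, CodeArticulo"
).fetchall()

print(ordered)
```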

T-Sql Combining Multiple Columns into One Column

How do I combine the calculation date columns into one column? What's the SQL function to make this happen? The rest of the fields are distinct values based on the calculation date. I only need the distinct values associated with the dates.
EDIT
I tried the ISNULL and COALESCE functions, and this is not what I'm looking for, because it still brings back all the values for both of the dates. I only need the data as of the date for select accounts. I don't want the data for both dates on the same account.
I also tried SELECT DISTINCT, and it's not working for me.
You can use COALESCE
SELECT COALESCE(Calculation_Date, Calculation_Date)
FROM tableName
Assuming only 1 of them will ever have a value, one option is to use coalesce:
select coalesce(date1, date2)
from yourtable
Since you have only two columns, an alternative is to use ISNULL:
SELECT ISNULL(FIRST_CALCULATION_DATE, SECOND_CALCULATION_DATE) AS ActualCalculationDate
FROM TheTable
You should receive the same results as for COALESCE, but it is interesting to know that there are some subtle differences between them when it comes to determining result type.
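For reference, here is a small sketch of what COALESCE does when exactly one of the two date columns is filled per row. It runs on SQLite via Python's sqlite3 module (an assumption; the thread is about T-SQL, and the result-type differences between ISNULL and COALESCE mentioned above are not reproduced here). The table name and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TheTable ("
    "account TEXT, FIRST_CALCULATION_DATE TEXT, SECOND_CALCULATION_DATE TEXT)"
)
conn.executemany(
    "INSERT INTO TheTable VALUES (?, ?, ?)",
    [("A", "2021-01-31", None), ("B", None, "2021-02-28")],
)

# COALESCE returns its first non-NULL argument, collapsing two columns to one.
combined = conn.execute(
    "SELECT account, "
    "COALESCE(FIRST_CALCULATION_DATE, SECOND_CALCULATION_DATE) AS ActualCalculationDate "
    "FROM TheTable ORDER BY account"
).fetchall()

print(combined)
```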

Select record online by max online ordered by date

I need help with SQL:
I need to get the maximum online value for each day, grouped by day
(http://prntscr.com/a7j2sm)
My SQL select:
SELECT id, date, MAX(online)
FROM `record_online_1`
GROUP BY DAY(date)
and result - http://prntscr.com/a7j3sp
This result is incorrect: the max online value is correct, but the date and id returned for that top record are not. I have no idea how to solve this issue.
UPD: using MySQL MariaDB
When you use aggregate functions, every item in the SELECT list that isn't part of an aggregate function has to appear in the GROUP BY clause. In T-SQL, you simply cannot execute the above query unless you also GROUP BY "id", for example. Some database systems let you skip this rule, but they are not smart enough to know which ID to bring back to you. You should only do this if, for example, all "ids" for that group are the same.
So what should you do? Do this in two steps. Step one, find the max values. You will lose the ID and DATETIME data.
SELECT DAY(date) AS Date, MAX(online) AS MaxOnline
FROM `record_online_1` GROUP BY DAY(date)
The above will get you a list of dates with the max for each day. INNER JOIN this to the original "record_online_1" table, joining specifically on the date and max value. You can use a CTE, temp table, subquery, etc to do this.
EDIT: I found an answer that is more eloquent than my own.
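The two-step approach can be sketched like this. It runs on SQLite via Python's sqlite3 module instead of MariaDB (an assumption), so it groups with SQLite's date() function; that groups by the full calendar date rather than by DAY(date), which would merge the same day-of-month from different months. The rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record_online_1 (id INTEGER, date TEXT, online INTEGER)")
conn.executemany(
    "INSERT INTO record_online_1 VALUES (?, ?, ?)",
    [
        (1, "2016-02-24 10:00:00", 120),
        (2, "2016-02-24 20:00:00", 310),
        (3, "2016-02-25 09:00:00", 95),
        (4, "2016-02-25 22:30:00", 280),
    ],
)

# Step one (inner query): the maximum per calendar day.
# Step two (join): recover the id and full date of the rows holding that maximum.
daily_peaks = conn.execute(
    """
    SELECT r.id, r.date, r.online
    FROM record_online_1 AS r
    INNER JOIN (
        SELECT date(date) AS day, MAX(online) AS max_online
        FROM record_online_1
        GROUP BY date(date)
    ) AS m
      ON date(r.date) = m.day AND r.online = m.max_online
    ORDER BY r.date
    """
).fetchall()

print(daily_peaks)
```

If two rows on the same day share the maximum, both are returned, so a further tie-breaking rule would be needed to keep exactly one row per day.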

Getting additional info on the result of a SQL max query

Say I want to do this with SQL (Sybase): Find all fields of the record with the latest timestamp.
One way to write that is like this:
select * from data where timestamp = (select max(timestamp) from data)
This is a bit silly because it causes two queries: first to find the max timestamp, and then to find all the data for that timestamp (assume it's unique, and yes, I do have an index on timestamp). Moreover, it just seems unnecessary, because max() has already found the row that I am interested in, so looking for it again is wasteful.
Is there a way to directly access fields of the row that max() returns?
Edit: All answers I see are basically clever hacks - I was looking for a syntactic way of doing something like max(field1).field2 to access field2 of the row with max field1
SELECT TOP 1 * from data ORDER BY timestamp DESC
No, using an aggregate means that you are automatically grouping, so there isn't a single row to get data from even if the group happens to contain a single row.
You can order by the field and get the first row:
set rowcount 1
select * from data order by timestamp desc
(Note that you shouldn't use select *, but rather specify the fields that you want from the query. That makes the query less sensitive to changes in the database layout.)
Can you try this:
SELECT TOP 1 *
FROM data
ORDER BY timestamp DESC
You're making assumptions about how Sybase optimizes queries. For all you know, it may do precisely what you want it to do - it may notice both queries are from "data" and that the condition is "where =", and may optimize as you suggest.
I know in the case of SQL Server, it's possible to configure indexes to include fields from the indexed row. Doing a select through such an index leaves those fields available.
This is SQL server, but you'll get the idea.
SELECT TOP(1) * FROM data
ORDER BY timestamp DESC;
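As a quick check that the subquery form and the order-and-take-first form agree when the timestamp is unique, here is a sketch on SQLite via Python's sqlite3 module (an assumption; SQLite has neither TOP nor set rowcount, so LIMIT 1 is used as its counterpart). The rows and the value column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, timestamp TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (1, "2021-05-01 12:00:00", "a"),
        (2, "2021-05-02 12:00:00", "b"),
        (3, "2021-05-03 12:00:00", "c"),
    ],
)

# Form from the question: match the row against the maximum timestamp.
via_subquery = conn.execute(
    "SELECT * FROM data WHERE timestamp = (SELECT MAX(timestamp) FROM data)"
).fetchall()

# Form from the answers: sort descending and keep only the first row.
via_order = conn.execute(
    "SELECT * FROM data ORDER BY timestamp DESC LIMIT 1"
).fetchall()

print(via_subquery == via_order, via_order)
```

Which form is faster depends on the engine and the available indexes, so this only demonstrates that the results match, not that either one is cheaper.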