Parsing attributes from sql query in Ruby on Rails [duplicate] - sql

I have this query:
Model.group(:p_id).pluck("AVG(desired)")
=> [0.77666666666666667e1, 0.431666666666666667e2, ...]
but when I ran the SQL
SELECT AVG(desired) AS desired
FROM model
GROUP BY p_id
I got
+------------------+
| desired          |
+------------------+
| 7.76666666666667 |
| 43.1666666666667 |
| ...              |
+------------------+
What is the reason for this? Sure, I can multiply, but I bet there should be an explanation.
I found that
Model.group(:p_id).pluck("AVG(desired)").map{|a| a.to_f}
=> [7.76666666666667,43.1666666666667, ...]
Now I'm struggling with another task: I need several attributes in pluck, so my query is:
Model.group(:p_id).pluck("p_id, AVG(desired)")
How do I get the correct AVG value in this case?

0.77666666666666667e1 is (almost) 7.76666666666667; they're the same number in two different representations with slightly different precision. If you dump the first one into irb, you'll see:
> 0.77666666666666667e1
=> 7.766666666666667
When you perform an avg in the database, the result has type numeric which ActiveRecord represents using Ruby's BigDecimal. The BigDecimal values are being displayed in scientific notation but that shouldn't make any difference when you format your data for display.
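To make the "same number, different notation" point concrete, here is a minimal plain-Ruby sketch that needs no database. The helper name `representations` is hypothetical, and the input is the first value from the question written out as a decimal string.

```ruby
require "bigdecimal"

# Hypothetical helper: render one numeric value in three ways.
def representations(text)
  value = BigDecimal(text)
  {
    scientific: value.to_s,   # BigDecimal's default notation, e.g. "0.77...e1"
    plain: value.to_s("F"),   # conventional decimal notation
    float: value.to_f         # converted to a Ruby Float
  }
end

p representations("7.7666666666666667")
```

Only the formatting differs between the three entries; the underlying value is the same up to Float precision.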
In any case, pluck isn't the right tool for this job, you want to use average:
Model.group(:p_id).average(:desired)
That will give you a Hash which maps p_id to averages. You'll still get the averages in BigDecimals but that really shouldn't be a problem.
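If plain floats are needed anyway, the conversion can happen on the Ruby side. The following is a sketch only: the literal Hash and Array stand in for what `average` and the two-column `pluck` would return (an assumption, since no database is involved here), and both helper names are hypothetical.

```ruby
require "bigdecimal"

# Hypothetical helper: convert a { p_id => BigDecimal } Hash to floats.
def to_float_averages(averages)
  averages.transform_values(&:to_f)
end

# Hypothetical helper: same idea for [p_id, BigDecimal] pairs,
# the shape a two-column pluck would produce.
def to_float_rows(rows)
  rows.map { |p_id, avg| [p_id, avg.to_f] }
end

# Stand-in data using the two averages from the question.
sample = {
  1 => BigDecimal("7.7666666666666667"),
  2 => BigDecimal("43.1666666666666667")
}

p to_float_averages(sample)
p to_float_rows(sample.to_a)
```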

Finally I've found a solution:
Model.group(:p_id).pluck("p_id, AVG(Round(desired))")
=> [[1,7.76666666666667],[2,43.1666666666667], ...]

Related

How to get only distinct values from list

What I have: A datasource with a string column, let's call it "name".
There are more, but those are not relevant to the question.
The "name" column in the context of a concrete query contains only 2 distinct values:
""
"SomeName"
Either of the two can appear a varying number of times, but there will only ever be those two.
Now, what I need is: In the context of a summarize statement, I need a column filled with the two distinct values concatenated (strcat) together, so I end up with just "SomeName".
What I have is not meeting this requirement and I cannot bring myself to find a solution for this:
datatable(name:string)["","SomeName","SomeName"] // just to give a minimal reproducible example
| summarize Name = strcat_array(make_list(name), "")
which gives me
| Name
> SomeNameSomeName
but I need just
| Name
> SomeName
I am aware that I need to do some sort of "distinct" somehow and somewhere or maybe there is a completely different solution to get to the same result?
So, my question is: What do I need to change in the shown query to fulfill my requirement?
take_any()
When the function is provided with a single column reference, it will
attempt to return a non-null/non-empty value, if such value is
present.
datatable(name:string)["","SomeName","SomeName", ""]
| summarize take_any(name)
name
SomeName
Wow, just as I posted the question, I found an answer:
datatable(name:string)["","SomeName","SomeName", ""]
| summarize Name = max(name)
I have no idea why this works for a string column, but here I am.
This results in my desired outcome:
| Name
> SomeName
...which I suppose is probably less efficient than David's answer. So I'll prefer his one.
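One plausible reason `max` works here, shown as a Ruby analogy rather than Kusto itself: under ordinary string ordering the empty string sorts before any non-empty string, so the maximum of the two distinct values is the non-empty one. Whether Kusto orders strings the same way is an assumption; both helper names are hypothetical.

```ruby
# Hypothetical helper: the largest string under Ruby's default ordering.
def largest_name(names)
  names.max
end

# Hypothetical helper: the explicit "distinct, then concatenate" route.
def distinct_concatenated(names)
  names.uniq.join
end

names = ["", "SomeName", "SomeName", ""]
p largest_name(names)
p distinct_concatenated(names)
```

Both routes give the same answer only because one of the two distinct values is empty; with two non-empty names, `max` would silently drop one of them.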

Meaning of these two queries (sql injection)

Can someone explain why these two queries (sometimes) cause errors? I googled some explanations, but none of them were right. I don't want to fix them. These queries are actually meant to be used for an SQL injection attack (error-based SQL injection, I think). The triggered error should be "duplicate entry". I'm trying to find out why they sometimes cause errors.
Thanks.
select
count(*)
from
information_schema.tables
group by
concat(version(),
floor(rand()*2));
select
count(*),
concat(version(),
floor(rand()*2))x
from
information_schema.tables
group by
x;
It seems the second one is trying to guess which database the victim of the injection is using.
The second one is giving me this:
+----------+------------------+
| count(*) | x                |
+----------+------------------+
|       88 | 10.1.38-MariaDB0 |
|       90 | 10.1.38-MariaDB1 |
+----------+------------------+
Okay, I'm going to post an answer - and it's more of a frame challenge to the question itself.
Basically: this query is silly, and it should be rewritten; find out what it's supposed to do and rewrite it in a way that makes sense.
What does the query currently do?
It looks like it's getting a count of the tables in the current database... except it's grouping by a calculated column. And that column looks like it is Version() and appends either a '0' or a '1' to it (chosen randomly.)
So the end result? Two rows, each with a numerical value, the sum of which adds up to the total number of tables in the current database. If there are 30 tables, you might get 13/17 one time, 19/11 the next, followed by 16/14.
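The random split described above can be imitated in plain Ruby. This is only a simulation of grouping by a version string with a random 0 or 1 appended, not of how the database executes the query; `random_split` is a hypothetical name, and the version string is taken from the sample output above.

```ruby
VERSION_STRING = "10.1.38-MariaDB" # from the sample output above

# Hypothetical helper: assign each of `table_count` rows a random
# group key (version + 0 or 1) and count the rows per key.
def random_split(table_count, rng: Random.new)
  keys = Array.new(table_count) { "#{VERSION_STRING}#{rng.rand(2)}" }
  keys.each_with_object(Hash.new(0)) { |key, counts| counts[key] += 1 }
end

# Each run splits the 30 rows differently, but the counts always sum to 30.
3.times { p random_split(30) }
```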
I have a hard time believing that this is what the query is supposed to do. So instead of just trying to fix the "error" - dig in and figure out what piece of data it should be returning - and then rewrite the proc to do it.

Grouping by name of API, but ignoring parameters - Application Insights

I am using application insights to monitor API usage in my application. I am trying to generate a report to list down how many times a particular API was called over the last 2 months. Here is my query
requests
| where timestamp >= ago(24*60h)
| summarize count() by name
| order by count_ desc
The problem is that the 'name' of the API has also got parameters attached along with the URL, and so the same API appears many times in the result set with different parameters (e.g. GET api/getTasks/1, GET api/getTasks/2). I tried to look through the 'requests' schema to check if there is a column that I could use which had the API name without parameters, but couldn't find it. Is there a way to group by 'name' without parameters on insights? Please help with the query. Thanks so much in advance.
This cuts everything after the second slash:
requests
| where timestamp > ago(1d)
| extend idx = indexof(name, "/", indexof(name, "api/") + 4)
| extend strippedname = iff(idx >= 0, substring(name, 0, idx), name)
| summarize count() by strippedname
| order by count_
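For readers who want to check the index arithmetic outside Application Insights, here is the same stripping logic as a plain-Ruby sketch. The helper names and the sample request names are hypothetical; `|| -1` mimics `indexof` returning -1 when the substring is missing.

```ruby
# Hypothetical helper: keep everything before the first "/" that
# follows "api/", mirroring the indexof/substring steps above.
def strip_parameters(name)
  start = (name.index("api/") || -1) + 4
  idx = name.index("/", start)
  idx ? name[0, idx] : name
end

# Hypothetical helper: count requests per stripped name.
def count_by_api(names)
  names.each_with_object(Hash.new(0)) do |name, counts|
    counts[strip_parameters(name)] += 1
  end
end

p count_by_api(["GET api/getTasks/1", "GET api/getTasks/2", "GET api/getUsers"])
```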
Another approach (if API surface is small) is to extract values through nested iif operators.

Histograms with ActiveRecord

I have two entity types in a one-to-many association:
A --1:*--> B
I would like to obtain a histogram of counts of b per each a in ActiveRecord.
Something like:
A.id | count(B)
1 | 20
2 | 32
3 | 332
4 | 0
[ {:id=>1, :count=>20},{:id=>2,:count=>32}, ... ]
I could do it directly in MySQL, but I was wondering what the proper way of doing it in ActiveRecord is.
As far as I know, it's always going to be a bit of a mix and match of AR and SQL.
Let's say you want to count comments for each post. The following code:
Post.joins(:comments).group("posts.id").count("comments.id")
will produce something like:
{2=>304, 3=>329, 4=>46, 6=>342}
where the hash keys are the post ids. Notice, however, that because of the inner join enforced by joins you will only get posts with existing comments. But, just like in your example, it makes sense to also list post ids that have zero comments.
You might then think to include, rather than joining comments:
Post.includes(:comments).group("posts.id").count("comments.id")
But that won't work. For reasons I am not completely sure about, the includes is ignored by AR, which results in a database error.
In the end, I resorted to the following much more explicit and sql-ish query:
Post.select("posts.id, COUNT(comments.id) AS comm_count")
.joins("LEFT OUTER JOIN comments ON posts.id=comments.post_id")
.group("posts.id")
This will return an array of actual Post models (not just :id => count hashes) with the added attribute comm_count, which is what you need.
Of course you may add other post attributes to the select, such as the title, the content, etc.
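The difference between the inner join and the left outer join described above can be illustrated without a database. This is a plain-Ruby sketch over made-up ids, not ActiveRecord; both helper names and the sample data are hypothetical.

```ruby
# Hypothetical helper: counts per post id, built only from existing
# comments. Posts without comments never appear (inner-join behaviour).
def inner_join_counts(comment_post_ids)
  comment_post_ids.each_with_object(Hash.new(0)) { |id, counts| counts[id] += 1 }
end

# Hypothetical helper: start from every post id, so posts without
# comments are kept with a count of 0 (left-outer-join behaviour).
def left_join_counts(post_ids, comment_post_ids)
  counts = inner_join_counts(comment_post_ids)
  post_ids.to_h { |id| [id, counts.fetch(id, 0)] }
end

post_ids = [1, 2, 3, 4]
comment_post_ids = [1, 1, 2, 3, 3, 3] # post 4 has no comments

p inner_join_counts(comment_post_ids)
p left_join_counts(post_ids, comment_post_ids)
```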

Does changing data type decimal to float cause data loss?

I need to change a couple of fields in my database from:
:decimal, :precision => 8, :scale => 5
to:
:float
Does this migration result in data loss? The current data consists of integers between 0 and 999.
If this migration will impact these numbers already stored, how can I keep this data safe?
Setup: Ruby on Rails 3 running on Heroku (solution would need to work for both PostgreSQL and MySQL).
It will, if you ever need to do exact comparisons or floating-point arithmetic on your numbers. Open up PostgreSQL and try this:
select floor(.8::float * 10.0::float); -- 8
select floor((.1::float + .7::float) * 10.0::float); -- 7
See this related question for the reason:
Why do simple doubles like 1.82 end up being 1.819999999645634565360?
Integers between 0 and 999 will fit in either and the data won't be impacted. If it is just integers - why not use ints?
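Both points can be checked from Ruby, whose Float is a double-precision value. This is a small sketch with hypothetical helper names, not a statement about any particular database's column types.

```ruby
# Hypothetical helper: the same floor-of-ten-times check as the SQL above.
def floor_times_ten(value)
  (value * 10.0).floor
end

# Hypothetical helper: do all integers in the range survive a round
# trip through Float unchanged?
def integers_survive_float?(range)
  range.all? { |i| i.to_f.to_i == i }
end

puts floor_times_ten(0.8)        # 8
puts floor_times_ten(0.1 + 0.7)  # 7, because 0.1 + 0.7 is slightly below 0.8
puts integers_survive_float?(0..999)
```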