Why does DataStax DevCenter restrict results to 1,000 rows?

DataStax DevCenter limits the rows displayed for a table to 1,000. Is there a reason for this restriction?
When I run SELECT count(*) FROM tablename; the performance on the Cassandra side should be the same whether 1,000 records or the complete record set is displayed.

DevCenter version 1.6.0 introduces result set paging, which allows you to browse all the rows in your result set.
In DevCenter 1.6.0 the "with limit" value sets the paging size, i.e. the number of records to view per page, and is still limited to a maximum of 1000. However, you can now page forward (and back) through all of the query results.
A related new feature allows you to export all results to a file, either as CSV or as INSERT statements. Right-click in the results view area and select "Export all results to File as [CSV|Insert]".
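This kind of result set paging is also exposed by the Cassandra drivers themselves (likely the same native protocol paging DevCenter relies on). As a rough sketch only, here is what it looks like with the Python cassandra-driver, where fetch_size plays the role of DevCenter's page size; the contact point, keyspace, and table name below are placeholders:

from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Placeholder contact point and keyspace; adjust for your cluster.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")

# fetch_size is the page size, analogous to DevCenter's "with limit" value.
statement = SimpleStatement("SELECT * FROM tablename", fetch_size=1000)

# Iterating fetches the next page from the server transparently,
# so the whole result set is walked 1000 rows at a time.
for row in session.execute(statement):
    print(row)

cluster.shutdown()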

This is by design; consider it a safeguard that prevents you from accidentally fetching thousands or millions of rows, which, among other problems, could have a serious impact on your network's bandwidth usage.

When you run a query in DataStax DevCenter 1.6, the results view shows only the first 1000 records (the selected limit), but if you export the same result to CSV you get all of the records you are looking for.

I run DataStax DevCenter 1.4.
I run the count query with an explicit LIMIT and it gives me the actual count.
But LIMIT is capped at the maximum value of a signed 32-bit integer (2147483647):
SELECT count(*) FROM users LIMIT 2147483647; -- add ALLOW FILTERING if the query requires it

Related

Google BigQuery results

I am getting only part of the result from the BigQuery API.
Earlier, I solved the issue of 100,000 records per result using iterators.
However, now I'm stuck at another obstacle.
If I select more than 6-7 columns, I do not get the complete result set.
However, if I select a single column, I get the complete result.
Can there be a size limit as well for results in the BigQuery API?
There are some limits for query jobs. In particular:
Maximum response size: 128 MB compressed
Of course, it is unlimited when writing large query results to a destination table (and then reading from there).
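If a query hits that compressed response-size ceiling, the usual workaround is exactly the one mentioned above: write the results to a destination table and page through the table instead of the query response. A minimal sketch with a recent version of the Python google-cloud-bigquery client; the project, dataset, table names, and query are placeholders:

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder fully qualified destination table id.
destination = "my-project.my_dataset.query_results"

# Write the (possibly very large) result set to a table instead of
# returning it all in the API response.
job_config = bigquery.QueryJobConfig(destination=destination)
job = client.query(
    "SELECT * FROM `my-project.my_dataset.wide_table`",  # placeholder query
    job_config=job_config,
)
job.result()  # wait for the query to finish

# Read the rows back in pages; the iterator follows page tokens for you.
for row in client.list_rows(destination, page_size=10000):
    print(row)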

Oracle APEX Presents Data Slower When Row Count Is Bigger Than 1,000,000

When I type the query in SQL Developer, it returns data in less than a second. When I do the same in Oracle APEX it takes much more time, over 5 seconds. I went into the DEBUG section to see what's wrong, and it returned this:
-IR binding: "APXWS_MAX_ROW_CNT" value="1000000"
I figured out that it returns more than 1,000,000 rows, and that's why it is slower. But I don't know how to fix it to get approximately the same time as in SQL Developer.
"Leave the Maximum Row Count property null, so classic reports won't fetch all the way to this number and interactive reports won't introduce the analytic function count(*) over ().
Don't use a Pagination Type with a Z, so classic reports won't fetch all rows and interactive reports again won't introduce count(*) over ()."
source: http://rwijk.blogspot.ca/2016/11/performance-aspects-of-apex-reports.html
(I saved it in the Wayback Machine too in case the link goes away: http://web.archive.org/web/20170706183715/http://rwijk.blogspot.ca/2016/11/performance-aspects-of-apex-reports.html)
Putting limits on Maximum Row Count and Maximum Rows per Page can help you mitigate the load time.
You will never get the same performance as SQL Developer in a web page, APEX or not.

Google BigQuery unable to process larger result set getting "Response too large to return" or "Resources exceeded during query execution"

I am currently working with a large table (~105M records) in a C# application.
When I query the table with an 'Order by' or 'Order Each by' clause, I get a "Resources exceeded during query execution" error.
If I remove the 'Order by' or 'Order Each by' clause, I get a "Response too large to return" error.
Here are sample queries for the two scenarios (I am using the Wikipedia public table):
SELECT Id,Title,Count(*) FROM [publicdata:samples.wikipedia] Group EACH by Id, title Order by Id, Title Desc
SELECT Id,Title,Count(*) FROM [publicdata:samples.wikipedia] Group EACH by Id, title
Here are the questions I have:
What is the maximum size of a BigQuery response?
How do we select all the records in a query request rather than via the 'Export Method'?
1. What is the maximum size of a BigQuery response?
As mentioned in the quota policy, the maximum query response size is 10 GB compressed (unlimited when returning large query results).
2. How do we select all the records in a query request rather than via the 'Export Method'?
If you plan to run a query that might return larger results, you can set allowLargeResults to true in your job configuration.
Queries that return large results will take longer to execute, even if the result set is small, and are subject to additional limitations:
You must specify a destination table.
You can't specify a top-level ORDER BY, TOP or LIMIT clause. Doing so negates the benefit of using allowLargeResults, because the query output can no longer be computed in parallel.
Window functions can return large query results only if used in conjunction with a PARTITION BY clause.
Read more about how to paginate to get the results here, and also read the BigQuery Analytics book, the pages starting at page 200, where it is explained how Jobs.getQueryResults works together with the maxResults parameter and its blocking mode.
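For reference, here is a rough sketch of what that job configuration can look like with the Python google-cloud-bigquery client; note that allowLargeResults only applies to legacy SQL, which is why use_legacy_sql is set, and the destination table id is a placeholder:

from google.cloud import bigquery

client = bigquery.Client()

# allowLargeResults requires legacy SQL and an explicit destination table.
job_config = bigquery.QueryJobConfig(
    allow_large_results=True,
    destination="my-project.my_dataset.wikipedia_counts",  # placeholder
    use_legacy_sql=True,
)

# No top-level ORDER BY, TOP or LIMIT, per the limitations listed above.
sql = (
    "SELECT Id, Title, COUNT(*) "
    "FROM [publicdata:samples.wikipedia] "
    "GROUP EACH BY Id, Title"
)

job = client.query(sql, job_config=job_config)
job.result()  # blocks until the job finishes; rows land in the destination table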
Update:
Query Result Size Limitations: When you run a normal query in BigQuery, the response size is limited to 10 GB of compressed data. Sometimes it is hard to know what 10 GB of compressed data means. Does it get compressed 2x? 10x? The results are compressed within their respective columns, which means the compression ratio tends to be very good. For example, if you have one column that is the name of a country, there will likely be only a few different values. When you have only a few distinct values, there isn't a lot of unique information, and the column will generally compress well. If you return encrypted blobs of data, they will likely not compress well because they will be mostly random. (This is explained in the book linked above, on page 220.)

Neo4j Configuration for 4M Nodes and 10M Relationships

I am new to Neo4j and have made a few graph queries on 4M nodes and 10M relationships. So far I've been completely surprised by the performance of my queries.
SCHEMA
.......
(a:user{data:1})-[:follow]->(:user)-[:next*1..10]-(:activity)
Here the user with data:1 follows another 100,000 users. Each of those 100,000 users has 2-8 next nodes attached (let's say the users' activities). Now I want to fetch the activities of users up to next level 3 ([:next*1..3]). Each activity has a numeric relevance property.
So now I have 100,000 * 3 nodes to traverse.
CYPHER
.......
match (u:user{data:1})-[:follow]-(:user)-[:next*1..3]-(a:activity)
return a order by a.relevance desc limit 50
This query takes about 72,000 ms almost every time. Since I am new to Neo4j, I am sure that I haven't tuned the OS.
I am using the following parameters:
Initial Java Heap Size (in MB)
wrapper.java.initmemory=2000
Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=2456
Default values for the low-level graph engine
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M
Please tell me what I am doing wrong. I read all the documentation on the Neo4j website, but the query time didn't improve.
Please tell me how I can configure a high-performing cache. What should I do so that the whole graph loads into memory? When I look at my RAM usage, it is always around 1.8 GB out of 4 GB. I am using an enterprise license on Windows (Neo4j 2.0). Please help.
You are actually traversing not 100k * 3 paths but 100k * (2-10)^10, i.e. on the order of 10^15 paths.
More memory in your machine would make a lot of sense, so try to get 8 GB or more.
Then you can increase the heap, e.g. to 6 GB:
wrapper.java.initmemory=6000
wrapper.java.maxmemory=6000
neo4j.properties
neostore.nodestore.db.mapped_memory=100M
neostore.relationshipstore.db.mapped_memory=500M
neostore.propertystore.db.mapped_memory=200M
neostore.propertystore.db.strings.mapped_memory=200M
neostore.propertystore.db.arrays.mapped_memory=10M
If you want to pull your data through, you would most probably want to invert your query:
match (a:activity), (u:user {data: 1})
with a, u
order by a.relevance desc limit 100
match (followed:user)-[:next*1..3]-(a)
where (followed)-[:follow]-(u)
return a
order by a.relevance desc limit 50

What is the true row limit for bigquery "Download as CSV"?

The BigQuery Browser tool documentation mentions the limit on CSV exports:
If a query result set has fewer than 16,000 rows, you can download it
as a CSV file.
We've tried this feature and the limit appears to be much smaller than 16,000 rows. Using this query:
SELECT
contract_id,
product,
AVG(element_value)
FROM [******.payout]
GROUP BY contract_id, product
LIMIT 600;
The UI responds with an error:
This result set contains too many rows for direct download. Please use
"Save as Table" and then export...
If I reduce the LIMIT to 500, then the CSV download works.
Can you please confirm the actual row limit? Is the documentation out of date, or is this a bug in the query interface?
This is a bug in the UI -- download as CSV should work with up to 16k rows (or 10 MB of results). This has been fixed internally and the fix should go out with next week's release.