I have set up a 3-node Apache Ignite cluster and noticed the following unexpected behavior:
(Tested with Ignite 2.10 and 2.13, Azul Java 11.0.13 on RHEL 8)
We have a relational table "RELATIONAL_META". It's created by our software vendor's product, which uses Ignite to exchange configuration data. The table is backed by this cache, which is replicated to all nodes:
[cacheName=SQL_PUBLIC_RELATIONAL_META, cacheId=-252123144, grpName=null, grpId=-252123144, prim=512, mapped=512, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, affCls=RendezvousAffinityFunction]
Observed behavior:
I did a failure test, simulating a disk failure on one of the Ignite nodes. The "failed" node restarts with an empty disk and joins the topology as expected. While the node is not yet part of the baseline, either because auto-adjust is disabled or because auto-adjust has not yet completed, the restarted node returns empty results via the JDBC connection:
0: jdbc:ignite:thin://b2bivmign2/> select * from RELATIONAL_META;
+------------+--------------+------+-------+---------+
| CLUSTER_ID | CLUSTER_TYPE | NAME | VALUE | DETAILS |
+------------+--------------+------+-------+---------+
+------------+--------------+------+-------+---------+
No rows selected (0.018 seconds)
Interestingly, it knows the structure of the table but not the data it contains.
The table actually contains data, as I can see when I query against one of the other cluster nodes:
0: jdbc:ignite:thin://b2bivmign1/> select * from RELATIONAL_META;
+-------------------------+--------------+----------------------+-------+------------------------------------------------------------------------------------------------------------------+
| CLUSTER_ID | CLUSTER_TYPE | NAME | VALUE | DETAILS |
+-------------------------+--------------+----------------------+-------+------------------------------------------------------------------------------------------------------------------+
| cluster_configuration_1 | writer | change index | 1653 | 2023-01-24 10:25:27 |
| cluster_configuration_1 | writer | last run changes | 0 | Updated at 2023-01-29 11:08:48. |
| cluster_configuration_1 | writer | require full sync | false | Flag set to false on 2022-06-11 09:46:45 |
| cluster_configuration_1 | writer | schema version | 1.4 | Updated at 2022-06-11 09:46:25. Previous version was 1.3 |
| cluster_processing_1 | reader | STOP synchronization | false | Resume synchronization - the processing has the same version as the config - 2.6-UP2022-05 [2023-01-29 11:00:50] |
| cluster_processing_1 | reader | change index | 1653 | 2023-01-29 10:20:39 |
| cluster_processing_1 | reader | conflicts | 0 | Reset due to full sync at 2022-06-11 09:50:12 |
| cluster_processing_1 | reader | require full sync | false | Cleared the flag after full reader sync at 2022-06-11 09:50:12 |
| cluster_processing_2 | reader | STOP synchronization | false | Resume synchronization - the processing has the same version as the config - 2.6-UP2022-05 [2023-01-29 11:00:43] |
| cluster_processing_2 | reader | change index | 1653 | 2023-01-29 10:24:06 |
| cluster_processing_2 | reader | conflicts | 0 | Reset due to full sync at 2022-06-11 09:52:19 |
| cluster_processing_2 | reader | require full sync | false | Cleared the flag after full reader sync at 2022-06-11 09:52:19 |
+-------------------------+--------------+----------------------+-------+------------------------------------------------------------------------------------------------------------------+
12 rows selected (0.043 seconds)
Expected behavior:
While a node is not part of the baseline, it is by definition not persisting data. So when I run a query against it, I would expect it to fetch the partitions it does not hold itself from the other nodes of the cluster. Instead it just returns an empty result, even showing the correct structure of the table, just without any rows. This has caused inconsistent behavior in the product we're actually running, which uses Ignite as a configuration store: the nodes suddenly see different results depending on which node they have opened their JDBC connection to. We are using a JDBC connection string that contains all the Ignite server nodes, so it fails over when one goes down, but of course that does not prevent the issue described here.
Is this "works a designed"? Is there any way to prevent such issues? It seems to be problematic to use Apache Ignite as a configuration store for an application with many nodes, when it behaves like this.
Regards,
Sven
Update:
After restarting one of the nodes with an empty disk, it joins as a node with a new ID; I think that is expected behavior. We have enabled baseline auto-adjust, so the new node ID should join the baseline and the old one should leave it. This works, but until it completes, the node returns empty results to SQL queries.
Cluster state: active
Current topology version: 95
Baseline auto adjustment enabled: softTimeout=60000
Baseline auto-adjust is in progress
Current topology version: 95 (Coordinator: ConsistentId=cdf43fef-deb8-4732-907f-6264bd55de6f, Address=b2bivmign3.fritz.box/192.168.0.151, Order=11)
Baseline nodes:
ConsistentId=3ffe3798-9a63-4dc7-b7df-502ad9efc76c, Address=b2bivmign1.fritz.box/192.168.0.149, State=ONLINE, Order=64
ConsistentId=40a8ae8c-5f21-4f47-8f67-2b68f396dbb9, State=OFFLINE
ConsistentId=cdf43fef-deb8-4732-907f-6264bd55de6f, Address=b2bivmign3.fritz.box/192.168.0.151, State=ONLINE, Order=11
--------------------------------------------------------------------------------
Number of baseline nodes: 3
Other nodes:
ConsistentId=080fc170-1f74-44e5-8ac2-62b94e3258d9, Order=95
Number of other nodes: 1
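For reference, baseline membership can also be checked via SQL from the SYS.BASELINE_NODES system view (available in recent Ignite versions, I believe since 2.8; I have not verified the exact column set):

select * from SYS.BASELINE_NODES;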
Update 2:
This is the JDBC URL the application uses:
#distributed.jdbc.url - run configure to modify this property
distributed.jdbc.url=jdbc:ignite:thin://b2bivmign1.fritz.box:10800..10820,b2bivmign2.fritz.box:10800..10820,b2bivmign3.fritz.box:10800..10820
#distributed.jdbc.driver - run configure to modify this property
distributed.jdbc.driver=org.apache.ignite.IgniteJdbcThinDriver
We have seen the application connect via JDBC to a node that was not part of the baseline and therefore receive empty results. I wonder why a node that is not part of the baseline returns any results at all, instead of fetching the data from the baseline nodes.
Update 3:
Whether this happens seems to depend on the attributes of the table/cache. I cannot yet reproduce it with a table I create on my own, only with the table that is created by the product we use.
This is the cache of the table that I can reproduce the issue with:
[cacheName=SQL_PUBLIC_RELATIONAL_META, cacheId=-252123144, grpName=null, grpId=-252123144, prim=512, mapped=512, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, affCls=RendezvousAffinityFunction]
I have created 2 tables of my own for testing:
CREATE TABLE Test (
Key CHAR(10),
Value CHAR(10),
PRIMARY KEY (Key)
) WITH "BACKUPS=2";
CREATE TABLE Test2 (
Key CHAR(10),
Value CHAR(10),
PRIMARY KEY (Key)
) WITH "BACKUPS=2,atomicity=ATOMIC";
I then shut down one of the Ignite nodes, in this case b2bivmign3, remove the Ignite data folders, and start it again. It starts as a new node that is not part of the baseline, and I disabled auto-adjust to keep that situation. I then connect to b2bivmign3 with the SQL CLI and query the tables:
0: jdbc:ignite:thin://b2bivmign3/> select * from Test;
+------+-------+
| KEY | VALUE |
+------+-------+
| Sven | Demo |
+------+-------+
1 row selected (0.202 seconds)
0: jdbc:ignite:thin://b2bivmign3/> select * from Test2;
+------+-------+
| KEY | VALUE |
+------+-------+
| Sven | Demo |
+------+-------+
1 row selected (0.029 seconds)
0: jdbc:ignite:thin://b2bivmign3/> select * from RELATIONAL_META;
+------------+--------------+------+-------+---------+
| CLUSTER_ID | CLUSTER_TYPE | NAME | VALUE | DETAILS |
+------------+--------------+------+-------+---------+
+------------+--------------+------+-------+---------+
No rows selected (0.043 seconds)
For comparison, the same queries against one of the other Ignite nodes:
0: jdbc:ignite:thin://b2bivmign2/> select * from Test;
+------+-------+
| KEY | VALUE |
+------+-------+
| Sven | Demo |
+------+-------+
1 row selected (0.074 seconds)
0: jdbc:ignite:thin://b2bivmign2/> select * from Test2;
+------+-------+
| KEY | VALUE |
+------+-------+
| Sven | Demo |
+------+-------+
1 row selected (0.023 seconds)
0: jdbc:ignite:thin://b2bivmign2/> select * from RELATIONAL_META;
+-------------------------+--------------+----------------------+-------+------------------------------------------------------------------------------------------------------------------+
| CLUSTER_ID | CLUSTER_TYPE | NAME | VALUE | DETAILS |
+-------------------------+--------------+----------------------+-------+------------------------------------------------------------------------------------------------------------------+
| cluster_configuration_1 | writer | change index | 1653 | 2023-01-24 10:25:27 |
| cluster_configuration_1 | writer | last run changes | 0 | Updated at 2023-01-29 11:08:48. |
| cluster_configuration_1 | writer | require full sync | false | Flag set to false on 2022-06-11 09:46:45 |
| cluster_configuration_1 | writer | schema version | 1.4 | Updated at 2022-06-11 09:46:25. Previous version was 1.3 |
| cluster_processing_1 | reader | STOP synchronization | false | Resume synchronization - the processing has the same version as the config - 2.6-UP2022-05 [2023-01-29 11:00:50] |
| cluster_processing_1 | reader | change index | 1653 | 2023-01-29 10:20:39 |
| cluster_processing_1 | reader | conflicts | 0 | Reset due to full sync at 2022-06-11 09:50:12 |
| cluster_processing_1 | reader | require full sync | false | Cleared the flag after full reader sync at 2022-06-11 09:50:12 |
| cluster_processing_2 | reader | STOP synchronization | false | Resume synchronization - the processing has the same version as the config - 2.6-UP2022-05 [2023-01-29 11:00:43] |
| cluster_processing_2 | reader | change index | 1653 | 2023-01-29 10:24:06 |
| cluster_processing_2 | reader | conflicts | 0 | Reset due to full sync at 2022-06-11 09:52:19 |
| cluster_processing_2 | reader | require full sync | false | Cleared the flag after full reader sync at 2022-06-11 09:52:19 |
+-------------------------+--------------+----------------------+-------+------------------------------------------------------------------------------------------------------------------+
12 rows selected (0.032 seconds)
I will test more tomorrow to find out which attribute of the table/cache triggers this issue.
Update 4:
I can reproduce this with a table that is set to mode=REPLICATED instead of PARTITIONED.
CREATE TABLE Test (
Key CHAR(10),
Value CHAR(10),
PRIMARY KEY (Key)
) WITH "BACKUPS=2";
[cacheName=SQL_PUBLIC_TEST, cacheId=-2066189417, grpName=null, grpId=-2066189417, prim=1024, mapped=1024, mode=PARTITIONED, atomicity=ATOMIC, backups=2, affCls=RendezvousAffinityFunction]
CREATE TABLE Test2 (
Key CHAR(10),
Value CHAR(10),
PRIMARY KEY (Key)
) WITH "BACKUPS=2,TEMPLATE=REPLICATED";
[cacheName=SQL_PUBLIC_TEST2, cacheId=372637563, grpName=null, grpId=372637563, prim=512, mapped=512, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, affCls=RendezvousAffinityFunction]
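As a cross-check, the cache mode and backup count can also be read via SQL from the SYS.CACHES system view (assuming a recent Ignite version; the column names are from memory, so verify them):

select CACHE_NAME, CACHE_MODE, BACKUPS from SYS.CACHES where CACHE_NAME like 'SQL_PUBLIC_TEST%';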
0: jdbc:ignite:thin://b2bivmign2/> select * from TEST;
+------+-------+
| KEY | VALUE |
+------+-------+
| Sven | Demo |
+------+-------+
1 row selected (0.06 seconds)
0: jdbc:ignite:thin://b2bivmign2/> select * from TEST2;
+-----+-------+
| KEY | VALUE |
+-----+-------+
+-----+-------+
No rows selected (0.014 seconds)
Testing with Visor:
It makes no difference where I run Visor; the results are the same.
We see both caches for the tables have 1 entry; in the listing below the columns are cache name, mode, node count, total entries (heap / off-heap), entries per node, and per-node hits, misses, reads, and writes:
+-----------------------------------------+-------------+-------+---------------------------------+-----------------------------------+-----------+-----------+-----------+-----------+
| SQL_PUBLIC_TEST(#c9) | PARTITIONED | 3 | 1 (0 / 1) | min: 0 (0 / 0) | min: 0 | min: 0 | min: 0 | min: 0 |
| | | | | avg: 0.33 (0.00 / 0.33) | avg: 0.00 | avg: 0.00 | avg: 0.00 | avg: 0.00 |
| | | | | max: 1 (0 / 1) | max: 0 | max: 0 | max: 0 | max: 0 |
+-----------------------------------------+-------------+-------+---------------------------------+-----------------------------------+-----------+-----------+-----------+-----------+
| SQL_PUBLIC_TEST2(#c10) | REPLICATED | 3 | 1 (0 / 1) | min: 0 (0 / 0) | min: 0 | min: 0 | min: 0 | min: 0 |
| | | | | avg: 0.33 (0.00 / 0.33) | avg: 0.00 | avg: 0.00 | avg: 0.00 | avg: 0.00 |
| | | | | max: 1 (0 / 1) | max: 0 | max: 0 | max: 0 | max: 0 |
+-----------------------------------------+-------------+-------+---------------------------------+-----------------------------------+-----------+-----------+-----------+-----------+
One is empty when I scan it, the other has one row as expected:
visor> cache -scan -c=#c9
Entries in cache: SQL_PUBLIC_TEST
+================================================================================================================================================+
| Key Class | Key | Value Class | Value |
+================================================================================================================================================+
| java.lang.String | Sven | o.a.i.i.binary.BinaryObjectImpl | SQL_PUBLIC_TEST_466f2363_47ed_4fba_be80_e33740804b97 [hash=-900301401, VALUE=Demo] |
+------------------------------------------------------------------------------------------------------------------------------------------------+
visor> cache -scan -c=#c10
Cache: SQL_PUBLIC_TEST2 is empty
visor>
Update 5:
I have reduced the configuration file to this:
https://pastebin.com/dL9Jja8Z
I did not manage to reproduce this with persistence turned off, as I cannot keep a node out of the baseline then; it always joins immediately. So this problem may only be reproducible with persistence enabled.
I go to each of the 3 nodes, remove the Ignite data to start from scratch, and start the service:
[root@b2bivmign1,2,3 apache-ignite]# rm -rf db/ diagnostic/ snapshots/
[root@b2bivmign1,2,3 apache-ignite]# systemctl start apache-ignite@b2bi-config.xml.service
I open Visor, check the topology to confirm all nodes have joined, then activate the cluster.
https://pastebin.com/v0ghckBZ
visor> top -activate
visor> quit
I connect with sqlline and create my tables:
https://pastebin.com/Q7KbjN2a
I go to one of the servers, stop the service and delete the data, then start the service again:
[root@b2bivmign2 apache-ignite]# systemctl stop apache-ignite@b2bi-config.xml.service
[root@b2bivmign2 apache-ignite]# rm -rf db/ diagnostic/ snapshots/
[root@b2bivmign2 apache-ignite]# systemctl start apache-ignite@b2bi-config.xml.service
Baseline looks like this:
https://pastebin.com/CeUGYLE7
I connect with sqlline to that node, and the issue reproduces:
https://pastebin.com/z4TMKYQq
This was reproduced on:
openjdk version "11.0.18" 2023-01-17 LTS
OpenJDK Runtime Environment Zulu11.62+17-CA (build 11.0.18+10-LTS)
OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode)
RPM: apache-ignite-2.14.0-1.noarch
Rocky Linux release 8.7 (Green Obsidian)
Hello everyone, and thank you in advance!
I'm having trouble finding a query to get a list of users that have queried some specific views.
An example to clarify: if I have a couple of views,
user_activity_last_6_months &
user_compliance_last_month
I need to know who is querying those 2 views and, if possible, other statistics. This could be a desired output:
+--------+-----------------------------+----------+----------------------------+----------------------------+----------------+-------------------+----------------------+------------------+
| userid | view_name | queryid | starttime | endtime | query_cpu_time | query_blocks_read | query_execution_time | return_row_count |
+--------+-----------------------------+----------+----------------------------+----------------------------+----------------+-------------------+----------------------+------------------+
| 293 | user_activity_last_6_months | 88723456 | 2018-05-08 13:08:08.727686 | 2018-05-08 13:08:12.423532 | 4 | 1023 | 6 | 435 |
| 345 | user_compliance_last_month | 99347882 | 2018-05-10 00:00:03.049967 | 2018-05-10 00:00:09.177362 | 6 | 345 | 8 | 214 |
| 345 | user_activity_last_6_months | 99347883 | 2018-05-10 12:27:36.637483 | 2018-05-10 12:27:44.502705 | 8 | 14 | 9 | 13 |
| 293 | user_compliance_last_month | 99347884 | 2018-05-10 12:31:00.433556 | 2018-05-10 12:31:30.090183 | 30 | 67 | 35 | 7654 |
+--------+-----------------------------+----------+----------------------------+----------------------------+----------------+-------------------+----------------------+------------------+
I have developed a query that gets this info for tables in the database, using the system tables and views, but I can't find a way to get the same results for views.
As I've said, the first 3 columns are mandatory and the others would be nice to have. Plus, any further information is welcome!
Thank you all!
If you need that level of auditing for table and view access, then I recommend you start by enabling Database Audit Logging for your Redshift cluster. This will generate a number of log files in S3.
The "User Activity Log" contains the text of every query run on the cluster; it can then either be loaded back into Redshift or added as a Spectrum table so that the query text can be parsed for table and view names.
I have the following BigQuery table:
+---------------------+-----------+-------------------------+-----------------+
| links.href | links.rel | dados.dataHora | dados.sequencia |
+---------------------+-----------+-------------------------+-----------------+
| https://www.url.com | self | 2017-03-16 16:27:10 UTC | 2 |
| | | 2017-03-16 16:35:34 UTC | 1 |
| | | 2017-03-16 19:50:32 UTC | 3 |
+---------------------+-----------+-------------------------+-----------------+
and I want to select all rows, so I try the following query:
SELECT * FROM [my_project:a_import.my_table] LIMIT 100
But I get a bad (and sad) error:
Error: Cannot output multiple independently repeated fields at the same time. Found links_rel and dados_dataHora
Please, can anybody help me?
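One workaround sketch, in case it helps (untested here): legacy BigQuery SQL cannot output multiple independently repeated fields at once, so flattening both repeated records first should avoid the error, with the caveat that it produces a cross product of the links and dados values:

SELECT *
FROM FLATTEN(FLATTEN([my_project:a_import.my_table], links), dados)
LIMIT 100

Alternatively, selecting fields from only one of the repeated records at a time also sidesteps the error.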
I have reworked our API's logging system to use Azure Table Storage instead of SQL storage, for cost and performance reasons. I am now migrating our legacy logs to the new system. I am building a SQL query per table that maps the old fields to the new ones, with the intention of exporting to CSV and then importing into Azure.
So far, so good. However, one artifact of the previous system is that it logged 3 times per request (call begin, call response and call end), while the new one logs the call as just one entry (again, for cost and performance reasons).
Some fields are common to all three related logs, e.g. Session, which uniquely identifies the call.
For some fields I only want the first log's value, e.g. Date, which may be a few seconds different in the second and third logs.
Some fields are shared between the three different purposes, e.g. Parameters holds the input model for Call Begin, the output model for Call Response, and the HTTP response (e.g. OK) for Call End.
Some fields are unused for two of the purposes, e.g. ExecutionTime is -1 for Call Begin and Call Response, and a value in ms for Call End.
How can I "roll up" each set of 3 rows into one row per set? I have tried using DISTINCT and GROUP BY, but the fact that some of the information collides makes it very difficult. I apologize that my SQL isn't really good enough to explain what I'm asking for, so perhaps an example will make it clearer:
Example of what I have:
SQL:
SELECT * FROM [dbo].[Log]
Results:
+---------+---------------------+-------+------------+---------------+---------------+-----------------+--+
| Session | Date | Level | Context | Message | ExecutionTime | Parameters | |
+---------+---------------------+-------+------------+---------------+---------------+-----------------+--+
| 84248B7 | 2014-07-20 19:16:15 | INFO | GET v1/abc | Call Begin | -1 | {"Input":"xx"} | |
| 84248B7 | 2014-07-20 19:16:15 | INFO | GET v1/abc | Call Response | -1 | {"Output":"yy"} | |
| 84248B7 | 2014-07-20 19:16:15 | INFO | GET v1/abc | Call End | 123 | OK | |
| F76BCBB | 2014-07-20 19:16:17 | ERROR | GET v1/def | Call Begin | -1 | {"Input":"ww"} | |
| F76BCBB | 2014-07-20 19:16:18 | ERROR | GET v1/def | Call Response | -1 | {"Output":"vv"} | |
| F76BCBB | 2014-07-20 19:16:18 | ERROR | GET v1/def | Call End | 456 | BadRequest | |
+---------+---------------------+-------+------------+---------------+---------------+-----------------+--+
Example of what I want:
SQL:
[Need to write this query]
Results:
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
| Date | Level | Context | Message | ExecutionTime | InputModel | OutputModel | HttpResponse |
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
| 2014-07-20 19:16:15 | INFO | GET v1/abc | Api Call | 123 | {"Input":"xx"} | {"Output":"yy"} | OK |
| 2014-07-20 19:16:17 | ERROR | GET v1/def | Api Call | 456 | {"Input":"ww"} | {"Output":"vv"} | BadRequest |
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
-- Self-join the log on Session so each output row combines the
-- 'Call Begin', 'Call Response', and 'Call End' entries for one call.
SELECT L1.Session, L1.Date, L1.Level, L1.Context, 'Api Call' AS Message,
       L3.ExecutionTime,              -- only meaningful on the 'Call End' row
       L1.Parameters AS InputModel,   -- 'Call Begin' carries the input model
       L2.Parameters AS OutputModel,  -- 'Call Response' carries the output model
       L3.Parameters AS HttpResponse  -- 'Call End' carries the HTTP response
FROM Log L1
INNER JOIN Log L2 ON L1.Session = L2.Session
INNER JOIN Log L3 ON L1.Session = L3.Session
WHERE L1.Message = 'Call Begin'
  AND L2.Message = 'Call Response'
  AND L3.Message = 'Call End'
This would work with your sample data.
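If a Session could ever be missing one of its three rows, a conditional-aggregation variant is a bit more forgiving. A sketch, assuming at most one row per message type per Session (as in your sample data):

SELECT Session,
       MIN(Date) AS Date,        -- earliest of the set, i.e. the first log's value
       MIN(Level) AS Level,      -- identical across the set in your sample
       MIN(Context) AS Context,  -- likewise identical
       'Api Call' AS Message,
       MAX(CASE WHEN Message = 'Call End'      THEN ExecutionTime END) AS ExecutionTime,
       MAX(CASE WHEN Message = 'Call Begin'    THEN Parameters END) AS InputModel,
       MAX(CASE WHEN Message = 'Call Response' THEN Parameters END) AS OutputModel,
       MAX(CASE WHEN Message = 'Call End'      THEN Parameters END) AS HttpResponse
FROM [dbo].[Log]
GROUP BY Session;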
So I have been doing pretty well on my project (link to previous StackOverflow question), and have managed to learn quite a bit, but there is one problem that has been dogging me for days and I just can't seem to solve it.
It has to do with using the UNIX_TIMESTAMP call to convert dates in my SQL database to UNIX time format, but for some reason only one set of dates in my table is giving me issues!
==============
So these are the values I am getting -
#abridged here, see the results of the simpler SELECT statement below
#for the rest of the fields
| firstVst | nextVst | DOB |
| 1206936000 | 1396238400 | 0 |
| 1313726400 | 1313726400 | 278395200 |
| 1318910400 | 1413604800 | 0 |
| 1319083200 | 1413777600 | 0 |
when I use this SELECT statement -
SELECT SQL_CALC_FOUND_ROWS *,UNIX_TIMESTAMP(firstVst) AS firstVst,
UNIX_TIMESTAMP(nextVst) AS nextVst, UNIX_TIMESTAMP(DOB) AS DOB FROM people
ORDER BY "ref DESC";
So my big question is: why in the heck are 3 out of 4 of my DOBs being set to a date of 0 (i.e. 12/31/1969 on my PC)? Why is this not happening in my other fields?
I can see the data quite well using a simpler SELECT statement, and the DOB field looks fine...?
#formatting broken to change some variable names etc.
select * FROM people;
| ref | lastName | firstName | DOB | rN | lN | firstVst | disp | repName | nextVst |
| 10001 | BlankA | NameA | 1968-04-15 | 1000000 | 4600000 | 2008-03-31 | Positive | Patrick Smith | 2014-03-31 |
| 10002 | BlankB | NameB | 1978-10-28 | 1000001 | 4600001 | 2011-08-19 | Positive | Patrick Smith | 2011-08-19 |
| 10003 | BlankC | NameC | 1941-06-08 | 1000002 | 4600002 | 2011-10-18 | Positive | Patrick Smith | 2014-10-18 |
| 10004 | BlankD | NameD | 1952-08-01 | 1000003 | 4600003 | 2011-10-20 | Positive | Patrick Smith | 2014-10-20 |
It's because those DOBs are from before the UNIX epoch, so their timestamps would be negative, and MySQL's UNIX_TIMESTAMP() returns 0 for dates outside its supported range. (0 renders as 12/31/1969 on your PC because your time zone is west of UTC.)
From Wikipedia:
Unix time, or POSIX time, is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, not counting leap seconds.
A bit more elaboration: what you're trying to do simply isn't possible with UNIX_TIMESTAMP(). Depending on what it's for, there may be a different way to do it, but UNIX timestamps probably aren't the best representation for dates like that.
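If you do need an epoch-style number for those dates, one workaround sketch is to compute the signed offset from the epoch yourself. Note that, unlike UNIX_TIMESTAMP(), TIMESTAMPDIFF() does no time-zone conversion, so the result is based on the stored date as-is:

SELECT ref, lastName, firstName,
       TIMESTAMPDIFF(SECOND, '1970-01-01 00:00:00', DOB) AS DOB
FROM people;

This returns negative values for pre-1970 dates (roughly -54 million seconds for 1968-04-15) instead of 0.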