I'm having a problem with primary pool space reclamation on IBM TSM 7.1 in my test environment.
While there are multiple volumes with reclaimable space, I always get this error:
tsm: SERVER1>reclaim stg stgnasdisk thr=60 dura=60 wait=yes
ANR2111W RECLAIM STGPOOL: There is no data to process for STGNASDISK.
ANS8001I Return code 11.
Q vol snippet:
Volume Name             Storage     Device   Estimated  Pct    Volume
                        Pool Name   Class    Capacity   Util   Status
----------------------  ----------  -------  ---------  -----  ------
E:\DATA\V306237716.BFS  STGNASDISK  PRIMARY  5.0 G        1.0  Full
E:\DATA\V306237717.BFS  STGNASDISK  PRIMARY  5.0 G        2.0  Full
E:\DATA\V306237718.BFS  STGNASDISK  PRIMARY  5.0 G        0.5  Full
E:\DATA\V306237719.BFS  STGNASDISK  PRIMARY  4.9 G       91.3  Full
E:\DATA\V306237720.BFS  STGNASDISK  PRIMARY  5.0 G       75.9  Full
E:\DATA\V306237721.BFS  STGNASDISK  PRIMARY  5.0 G        3.0  Full
E:\DATA\V306237722.BFS  STGNASDISK  PRIMARY  5.0 G        0.5  Full
E:\DATA\V306237723.BFS  STGNASDISK  PRIMARY  5.0 G       16.9  Full
E:\DATA\V306237724.BFS  STGNASDISK  PRIMARY  5.0 G        0.3  Full
E:\DATA\V34160080.BFS   STGNASDISK  PRIMARY  4.9 G       19.5  Full
E:\DATA\V34160081.BFS   STGNASDISK  PRIMARY  5.0 G       75.9  Full
E:\DATA\V34160082.BFS   STGNASDISK  PRIMARY  4.9 G       49.1  Full
E:\DATA\V34160083.BFS   STGNASDISK  PRIMARY  5.0 G       81.6  Full
This volume shows 99% reclaimable space, yet nothing happens:
tsm: SERVER1>q vol E:\DATA\V306237716.BFS f=d
Volume Name: E:\DATA\V306237716.BFS
Storage Pool Name: STGNASDISK
Device Class Name: PRIMARY
Estimated Capacity: 5.0 G
Scaled Capacity Applied:
Pct Util: 1.0
Volume Status: Full
Access: Read/Write
Pct. Reclaimable Space: 99.0
Scratch Volume?: No
In Error State?: No
Number of Writable Sides: 1
Number of Times Mounted: 6
Write Pass Number: 1
Approx. Date Last Written: 07/24/2015 12:16:54
Approx. Date Last Read: 08/14/2015 09:33:45
Date Became Pending:
Number of Write Errors: 0
Number of Read Errors: 0
Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator): SERVER_CONSOLE
Last Update Date/Time: 07/24/2015 12:10:04
Begin Reclaim Period:
End Reclaim Period:
Drive Encryption Key Manager:
Logical Block Protected: No
Reclamation of the copy pool works fine. Detailed storage pool overview:
tsm: SERVER1>q stg stgnasdisk f=d
Storage Pool Name: STGNASDISK
Storage Pool Type: Primary
Device Class Name: PRIMARY
Estimated Capacity: 558 G
Space Trigger Util: 38.4
Pct Util: 35.7
Pct Migr: 35.7
Pct Logical: 49.1
High Mig Pct: 90
Low Mig Pct: 70
Migration Delay: 0
Migration Continue: Yes
Migration Processes: 1
Reclamation Processes: 1
Next Storage Pool:
Reclaim Storage Pool:
Maximum Size Threshold: No Limit
Access: Read/Write
Description:
Overflow Location:
Cache Migrated Files?:
Collocate?: Group
Reclamation Threshold: 100
Offsite Reclamation Limit:
Maximum Scratch Volumes Allowed: 25
Number of Scratch Volumes Used: 1
Delay Period for Volume Reuse: 0 Day(s)
Migration in Progress?: No
Amount Migrated (MB): 0.00
Elapsed Migration Time (seconds): 0
Reclamation in Progress?: No
Last Update by (administrator): JEF
Last Update Date/Time: 08/20/2015 17:46:49
Storage Pool Data Format: Native
Copy Storage Pool(s):
Active Data Pool(s):
Continue Copy on Error?: Yes
CRC Data: No
Reclamation Type: Threshold
Overwrite Data when Deleted:
Deduplicate Data?: Yes
Processes For Identifying Duplicates: 1
Duplicate Data Not Stored: 0 (0%)
Auto-copy Mode: Client
Contains Data Deduplicated by Client?: No
Deduplicate Requires Backup?:
The reclamation threshold of 100% is intentional; it prevents reclamation from running during backups. Reclamation is started daily by the maintenance script.
Any help on this is very welcome. I have searched the net but did not really find anything that fixed the issue.
I asked IBM about this issue. TSM needs at least one pool without deduplication, otherwise it will never reclaim the primary dedup pool, even when the data has expired.
In the test environment I had deduplication enabled on both the primary and the copy pool.
I created a non-dedup copy pool and reclamation now works normally.
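For anyone hitting the same wall, the fix looked roughly like this (STGNASCOPY is a name I made up, and the device class and maxscratch values are from my test box, so adjust to your setup):
/* Copy pool without deduplication; DEDUPLICATE=NO is the key part. */
define stgpool STGNASCOPY PRIMARY pooltype=copy deduplicate=no maxscratch=25
/* Back up the primary pool into it once... */
backup stgpool STGNASDISK STGNASCOPY wait=yes
/* ...and reclamation of the primary pool now finds data to process. */
reclaim stg STGNASDISK thr=60 dura=60 wait=yes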
I was wondering if anyone can tell me what these mean. For most people posting about them, the count is no more than double digits. However, I have 1051556645921812989870080 Media and Data Integrity Errors on the SK hynix PC711 in my new HP Dev One. Thanks!
Here's my entire smartctl output:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.0.7-arch1-1] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: SK hynix PC711 HFS001TDE9X073N
Serial Number: KDB3N511010503A37
Firmware Version: HPS0
PCI Vendor/Subsystem ID: 0x1c5c
IEEE OUI Identifier: 0xace42e
Total NVM Capacity: 1,024,209,543,168 [1.02 TB]
Unallocated NVM Capacity: 0
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,024,209,543,168 [1.02 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: ace42e 00254f98f1
Local Time is: Wed Nov 9 13:58:37 2022 EST
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x001f): Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x1e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 84 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Namespace 1 Features (0x02): NA_Fields
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 6.3000W - - 0 0 0 0 5 5
1 + 2.4000W - - 1 1 1 1 30 30
2 + 1.9000W - - 2 2 2 2 100 100
3 - 0.0500W - - 3 3 3 3 1000 1000
4 - 0.0040W - - 3 3 3 3 1000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
1 - 4096 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 34 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 0%
Data Units Read: 13,162,025 [6.73 TB]
Data Units Written: 3,846,954 [1.96 TB]
Host Read Commands: 156,458,059
Host Write Commands: 128,658,566
Controller Busy Time: 116
Power Cycles: 273
Power On Hours: 126
Unsafe Shutdowns: 15
Media and Data Integrity Errors: 1051556645921812989870080
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 34 Celsius
Temperature Sensor 2: 36 Celsius
Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
I encountered a similar SMART reading from the same model.
I'm seeing a reported Media and Data Integrity Errors value on the order of 2^80.
It could just be an error with its SMART implementation or the utility reading from it.
Converting your reported value of 1051556645921812989870080 to hex, we get 0xdead0000000000000000 big endian and 0x0000000000000000adde little endian.
Similarly, when I convert my value to hex, I get 0xffff0000000000000000 big endian and 0x0000000000000000ffff little endian, where each f just denotes a nonzero nibble.
I'm going to assume that the Media and Data Integrity Errors value has no actual meaning with regard to real errors. I doubt that both of us would have values padded with sixteen zero nibbles when converted to hex. Something is sending/receiving/parsing bad data.
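You can verify the byte-shuffle theory with a couple of lines of Python (the value is the one from the question):
# The reported counter, copied verbatim from the smartctl output above.
value = 1051556645921812989870080
# Big endian it is 0xdead followed by sixteen zero nibbles.
print(hex(value))  # 0xdead0000000000000000
# Re-reading the same ten bytes in the opposite byte order collapses it to a
# tiny number, which is what you'd expect from an endianness mix-up somewhere
# between the drive's log page and the tool that renders it.
swapped = int.from_bytes(value.to_bytes(10, "big"), "little")
print(hex(swapped))  # 0xadde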
If you poke around the other reported SMART values in your post, and on my end, some of them don't seem to make much sense, either.
I have a table called device_info that looks like the sample below:
device_ip    cpu       memory
100.33.1.0   10.0      29.33
110.35.58.2  3.0, 2.0  20.47
220.17.58.3  4.0, 3.0  23.17
30.13.18.8   -1        26.47
70.65.18.10  -1        20.47
10.25.98.11  5.0, 7.0  19.88
12.15.38.10  7.0       22.45
Now I need to compare a number, say 3, against the cpu column values and get the rows where a value is greater than that. Since the cpu column values are stored as CSV, I am not sure how to do the comparison.
I found there is a function called string_to_array in Postgres which converts CSV to an array, and accordingly tried the query below, which didn't work out:
select device_ip, cpu, memory
from device_info
where 3 > any(string_to_array(cpu, ',')::float[]);
What am I doing wrong?
Expected output:
device_ip    cpu       memory
100.33.1.0   10.0      29.33
220.17.58.3  4.0, 3.0  23.17
10.25.98.11  5.0, 7.0  19.88
12.15.38.10  7.0       22.45
The statement as-is is saying "3 is greater than my array value". What I think you want is "3 is less than my array value".
Switch > to <.
select device_ip, cpu
from device_info
where 3 < any(string_to_array(cpu, ',')::float[]);
View on DB Fiddle
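For reference, here is a self-contained reproduction (table and rows reconstructed from the question, so treat it as a sketch):
-- Recreate the sample; cpu holds comma-separated values as text.
create table device_info (device_ip text, cpu text, memory numeric);
insert into device_info values
  ('100.33.1.0',  '10.0',     29.33),
  ('110.35.58.2', '3.0, 2.0', 20.47),
  ('220.17.58.3', '4.0, 3.0', 23.17),
  ('30.13.18.8',  '-1',       26.47),
  ('70.65.18.10', '-1',       20.47),
  ('10.25.98.11', '5.0, 7.0', 19.88),
  ('12.15.38.10', '7.0',      22.45);
-- "Keep rows where any cpu element exceeds 3"; the ::float[] cast copes
-- with the space after each comma.
select device_ip, cpu, memory
from device_info
where 3 < any(string_to_array(cpu, ',')::float[]);
This returns exactly the four rows in your expected output.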
We recently migrated from MapReduce to Tez for executing Hive queries on EMR. We are seeing cases where the exact same Hive query launches a very different number of mappers. See the Map 3 phase below: on the first run it requested 305 mappers and on another run it requested 4534. (Please ignore the KILLED status; I manually killed the query.) Why does this happen, and how can we change it to be based on the underlying data size instead?
Run 1
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 container KILLED 5 0 0 5 0 0
Map 3 container KILLED 305 0 0 305 0 0
Map 5 container KILLED 16 0 0 16 0 0
Map 6 container KILLED 1 0 0 1 0 0
Reducer 2 container KILLED 333 0 0 333 0 0
Reducer 4 container KILLED 796 0 0 796 0 0
----------------------------------------------------------------------------------------------
VERTICES: 00/06 [>>--------------------------] 0% ELAPSED TIME: 14.16 s
----------------------------------------------------------------------------------------------
Run 2
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 5 5 0 0 0 0
Map 3 container KILLED 4534 0 0 4534 0 0
Map 5 .......... container SUCCEEDED 325 325 0 0 0 0
Map 6 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 container KILLED 333 0 0 333 0 0
Reducer 4 container KILLED 796 0 0 796 0 0
----------------------------------------------------------------------------------------------
VERTICES: 03/06 [=>>-------------------------] 5% ELAPSED TIME: 527.16 s
----------------------------------------------------------------------------------------------
This article explains the process by which Tez allocates resources: https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works
If Tez grouping is enabled for the splits, then a generic grouping logic is run on these splits to group them into larger splits. The idea is to strike a balance between how parallel the processing is and how much work is being done in each parallel process.
First, Tez tries to find out the resource availability in the cluster for these tasks. For that, YARN provides a headroom value (and in future other attributes may be used). Let's say this value is T.
Next, Tez divides T by the resource per task (say M) to find out how many tasks can run in parallel at once (i.e. in a single wave): W = T/M.
Next, W is multiplied by a wave factor (from configuration - tez.grouping.split-waves) to determine the number of tasks to be used. Let's say this value is N.
If there are a total of X splits (input shards) and N tasks, then this would group X/N splits per task. Tez then estimates the size of data per task based on the number of splits per task.
If this value is between tez.grouping.max-size and tez.grouping.min-size, then N is accepted as the number of tasks. If not, then N is adjusted to bring the data per task in line with the max/min, depending on which threshold was crossed.
For experimental purposes, tez.grouping.split-count can be set in configuration to specify the desired number of groups. If this config is specified then the above logic is ignored and Tez tries to group splits into the specified number of groups. This is best effort.
After this the grouping algorithm is executed. It groups splits by node locality, then rack locality, while respecting the group size limits.
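That explains the variance you saw: N tracks free cluster resources at submit time, not data size. As a worked example (numbers made up): with headroom T worth 100 task-slots, M = 1 slot per task and split-waves = 1.7, Tez aims for N = 170 tasks whether the input is 10 GB or 1 TB, as long as the data per task stays inside the min/max bounds. To make grouping follow data size instead, tighten those bounds and lower the wave factor; the values below are illustrative, not recommendations:
-- Session-level Hive settings (could also go in tez-site.xml); sizes in bytes.
-- One wave of tasks, no over-provisioning:
set tez.grouping.split-waves=1.0;
-- At least 256 MB of input per task:
set tez.grouping.min-size=268435456;
-- At most 1 GB of input per task:
set tez.grouping.max-size=1073741824;
The closer min-size and max-size are to each other, the more the task count is determined by total input size alone.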
Date      From      To        Upload    Download  Total
03/12/15  00:53:52  01:53:52  407 KB    4.55 MB   4.94 MB
          01:53:51  02:53:51  68.33 MB  1.60 GB   1.66 GB
          02:53:51  03:53:51  95.39 MB  2.01 GB   2.10 GB
          03:53:50  04:53:50  0 KB      208 KB    209 KB
          04:53:50  05:53:50  0 KB      10 KB     11 KB
          05:53:49  06:53:49  0 KB      7 KB      7 KB
          06:53:49  07:53:49  370 KB    756 KB    1.10 MB
          07:53:48  08:53:48  2.69 MB   64.05 MB  66.74 MB
I have this data in a spreadsheet. The last column contains total data usage in an hour. I would like to add up all the data used in a day, in GB. As you can see, the units of the totals vary: KB, MB, and GB.
How can I do it in LibreOffice Calc?
Converting all the totals into kilobytes and then summing the column of kilobytes seems like the most straightforward method.
Assuming your "Total" column is column F, and the entries in this column are text (and not numbers formatted to have the varies byte size indicators on the end), this formula will convert GB into KB:
=IF(RIGHT(F2,2)="GB",1048576*VALUE(LEFT(F2,LEN(F2)-3)),"Not a GB entry")
The IF function takes parameters IF(Test is True, Then Do This, Else Do That). In this case we are telling Calc:
IF the right two characters in this string are "GB"
THEN take the string minus its last three characters with LEFT, convert it to a number with VALUE, and multiply by 1,048,576
ELSE give an error message
You want to handle GB, MB, and KB, which requires nested IF statements like so:
=IF(RIGHT(F2,2)="GB",1048576*VALUE(LEFT(F2,LEN(F2)-3)),IF(RIGHT(F2,2)="MB",1024*VALUE(LEFT(F2,LEN(F2)-3)),IF(RIGHT(F2,2)="KB",VALUE(LEFT(F2,LEN(F2)-3)),"No byte size given")))
Copy and paste the formula down however long your column is. Then SUM over the calculated KB values.
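For example, if the converted KB values land in G2:G9 (an assumption about your layout), the day's total in GB would be:
=SUM(G2:G9)/1048576
Dividing by 1,048,576 (1024 * 1024) converts the KB total back to GB.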
Here is the corresponding formula for G, M, and K suffixes, taking the value from cell B2 (note the semicolon argument separators):
=IF(RIGHT(B2;1)="G";1048576*VALUE(LEFT(B2;LEN(B2)-1));IF(RIGHT(B2;1)="M";1024*VALUE(LEFT(B2;LEN(B2)-1));IF(RIGHT(B2;1)="K";VALUE(LEFT(B2;LEN(B2)-1));"No byte size given")))
I have a dataset with the following details:
1.4 million nodes
2.9 million relationships
15 million properties (including gender, name, subscriber_id etc)
1 relationship type (Contacted)
I've batch-imported the data into the database on my machine (64-bit, 16-core, 16 GB RAM) using https://github.com/jexp/batch-import/tree/20
I'm trying to index these nodes on Subscriber_ID, but I'm not really sure what I'm doing.
I ran
start n = node(*) set n:Subscribers
My understanding is that this adds the label to each of the nodes (is this correct?)
Next I ran
create index on :Subscribers(SUBSCRIBER_ID)
which I think should create an index on the property 'SUBSCRIBER_ID' for all nodes with the 'Subscribers' label. (Correct?)
Now when I go to Neo4j-sh and run
neo4j-sh (?)$ schema
==> Indexes
==> ON :Subscribers(SU_SUBSCRIBER_ID) ONLINE
==>
==> No constraints
But when I run the following, it says there are no indexes set for the nodes:
neo4j-sh (?)$ index --indexes
==> Node indexes:
==>
==> Relationship indexes:
I have a few questions:
1. Do I have to tell it to index the existing data? If so, how do I do that?
2. How can I then use the index? I've read through the documentation but had a bit of trouble following it.
3. It looks like I can have the indexes set up when I run the batch import script, but I can't really understand how... could someone explain, please?
Here's an example of my data:
Nodes.txt
id  SU_SUBSCRIBER_ID  CU_FIRST_NAME  gender   SU_AGE
0   123456            Ann            F        56
1   832746            ?              UNKNOWN  -1
2   546765            Tom            UNKNOWN  -1
3   768345            Anges          F        72
4   267854            Aoibhlinn      F        38
rels.csv
start end rel counter
0 3 CONTACTED 2
1 2 CONTACTED 1
1 4 CONTACTED 1
3 2 CONTACTED 2
4 1 CONTACTED 1
schema is the right command to look at.
Cypher uses the label indexes automatically for MERGE and MATCH.
With the Java Core-API you'd use db.findNodesByLabelAndProperty(label,property,value)
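A minimal sketch of that call against the Neo4j 2.0-era embedded API; db is assumed to be an already-opened GraphDatabaseService, and the property value is assumed to have been imported as a string:
import org.neo4j.graphdb.*;
// Reads (like writes) must run inside a transaction in Neo4j 2.0.
try (Transaction tx = db.beginTx()) {
    Label label = DynamicLabel.label("Subscribers");
    for (Node n : db.findNodesByLabelAndProperty(label, "SU_SUBSCRIBER_ID", "123456")) {
        System.out.println(n.getProperty("CU_FIRST_NAME"));
    }
    tx.success();
}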
You did the right thing, except for one detail: you could have created the labels on the nodes while doing the batch import.
Just add an l:label field to your CSV file containing a comma-separated list of labels per node, as shown in the readme on that branch.
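To make that concrete, the Nodes.txt header would gain one extra column (a sketch; the label name must match the one you used in CREATE INDEX):
id  SU_SUBSCRIBER_ID  CU_FIRST_NAME  gender  SU_AGE  l:label
0   123456            Ann            F       56      Subscribers
And once schema shows the index as ONLINE, a plain Cypher equality match will use it automatically. Note that your schema output says the index is on SU_SUBSCRIBER_ID, so the property name in the query must match exactly:
MATCH (n:Subscribers)
WHERE n.SU_SUBSCRIBER_ID = '123456'
RETURN n;
(The value is quoted here on the assumption that the importer stored it as a string; drop the quotes if you imported it as a numeric field.)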