Viacheslav Eremin | Use PostgreSQL internal ctid to repair DB, to delete last records.

(SQL) SQL (2014)

Use PostgreSQL internal ctid to repair DB, to delete last records.

I'm faced with strange trouble, after I have restored PostgreSQL table with this structure:

postgres=# \d+ vbnet.entry
                                                                     Table "vbnet.entry"
   Column    |         Type          | Collation | Nullable |                 Default                  | Storage  | Compression | Stats target | Description
-------------+-----------------------+-----------+----------+------------------------------------------+----------+-------------+--------------+-------------
 i           | bigint                |           | not null | nextval('vbnet."Entry_i_seq"'::regclass) | plain    |             |              |
 parent      | character varying(33) |           | not null |                                          | extended |             |              |
 id          | character varying(33) |           | not null |                                          | extended |             |              |
 kind        | character varying     |           | not null |                                          | extended |             |              |
 name        | character varying     |           | not null |                                          | extended |             |              |
 mime        | character varying     |           | not null |                                          | extended |             |              |
 lastrequest | date                  |           |          |                                          | plain    |             |              |
 laststatus  | character varying     |           |          |                                          | extended |             |              |
 length      | numeric               |           |          |                                          | main     |             |              |
 type        | character varying     |           |          |                                          | extended |             |              |
 encoding    | character varying     |           |          |                                          | extended |             |              |
Access method: heap

maybe I forget setup start index in sequence "Entry_i_seq":

SELECT nextval('vbnet."Entry_i_seq"');
1

And when my program start adding data, I received duplicate records:

SELECT i, COUNT(*) 
FROM vbnet.entry 
GROUP BY i 
HAVING COUNT(*) > 1
ORDER BY i;
1	2
2	2
3	2
4	2
5	2
6	2
7	2
8	2
9	2
10	2
11	2
12	2
13	2
14	2
15	2
16	2
17	2
18	2
19	2
20	2
21	2
22	2
23	2
24	2
25	2
26	2
27	2
28	2
29	2
30	2
31	2
32	2
33	2
34	2
35	2
36	2
37	2
38	2
39	2
40	2
41	2
42	2
43	2
44	2
45	2
46	2
47	2
48	2
49	2
50	2
51	2
52	2
53	2
54	2
55	2
56	2
57	2
58	2
59	2
60	2
61	2
62	2
63	2
64	2
65	2
66	2
67	2
68	2
69	2
70	2
71	2

and: (This strange output format means- ctid (physical row location) plus (page,row) shows where each record is stored on disk.)

SELECT i, MIN(ctid) as oldest_record
FROM vbnet.entry
WHERE i IN (SELECT i FROM vbnet.entry GROUP BY i HAVING COUNT(*) > 1)
GROUP BY i
29	"(0,29)"
6	"(0,6)"
56	"(1,7)"
20	"(0,20)"
39	"(0,39)"
37	"(0,37)"
36	"(0,36)"
61	"(1,12)"
14	"(0,14)"
51	"(1,2)"
31	"(0,31)"
65	"(1,16)"
43	"(0,43)"
21	"(0,21)"
30	"(0,30)"
41	"(0,41)"
59	"(1,10)"
48	"(0,48)"
53	"(1,4)"
70	"(1,21)"
28	"(0,28)"
64	"(1,15)"
22	"(0,22)"
54	"(1,5)"
18	"(0,18)"
58	"(1,9)"
7	"(0,7)"
17	"(0,17)"
35	"(0,35)"
45	"(0,45)"
71	"(1,22)"
24	"(0,24)"
52	"(1,3)"
23	"(0,23)"
19	"(0,19)"
2	"(0,2)"
66	"(1,17)"
62	"(1,13)"
40	"(0,40)"
1	"(0,1)"
42	"(0,42)"
11	"(0,11)"
27	"(0,27)"
8	"(0,8)"
47	"(0,47)"
67	"(1,18)"
63	"(1,14)"
13	"(0,13)"
44	"(0,44)"
5	"(0,5)"
69	"(1,20)"
15	"(0,15)"
26	"(0,26)"
46	"(0,46)"
49	"(0,49)"
38	"(0,38)"
50	"(1,1)"
9	"(0,9)"
34	"(0,34)"
16	"(0,16)"
55	"(1,6)"
10	"(0,10)"
12	"(0,12)"
4	"(0,4)"
60	"(1,11)"
33	"(0,33)"
57	"(1,8)"
32	"(0,32)"
68	"(1,19)"
25	"(0,25)"
3	"(0,3)"

But the correct sequence should be:

SELECT nextval('vbnet."Entry_i_seq"'); 
165933

However I can easy restored my DB, because each records has internal PostgreSQL ctid number, physical row identifier, and this row identifier allow me delete last records with duplicated index, so this is my solution:

-- STEP 1: Create a backup (just in case)
CREATE TABLE vbnet.entry_backup AS SELECT * FROM vbnet.entry;

-- STEP 2: Delete all duplicates (keeping the oldest copy of each)
DELETE FROM vbnet.entry
WHERE ctid NOT IN (
  SELECT MIN(ctid)
  FROM vbnet.entry
  GROUP BY i
);

-- STEP 3: Verify only one record per i remains
SELECT i, COUNT(*) FROM vbnet.entry GROUP BY i HAVING COUNT(*) > 1; -- Should return 0 rows now

-- STEP 4: Reset the sequence to prevent future conflicts
SELECT setval('vbnet."Entry_i_seq"', (SELECT MAX(i) FROM vbnet.entry), true);

-- STEP 5: And verify next insert will work correctly
SELECT MAX(i) FROM vbnet.entry;
SELECT nextval('vbnet."Entry_i_seq"');  -- Should return MAX(i)+1
SELECT COUNT(*) FROM vbnet.entry;

Very simple task, based on physical row identifier, but MySQL and MsSql has no unique records number what increments each time after record adding.

Sql context:

Comments (

)

Link to this page: http://www.vb-net.com/PostgresqlCtid/Index.htm

< THANKS ME>