lundi 22 décembre 2014

Concurrent SELECT and UPDATE in multiprocessing python


I have to randomly match about 20 million rows in two tables without using the same row twice using PostgreSQL 9.3.5. Both tables have the following structure



CREATE TABLE de_sim_points_end
(
-- Inherited from table de_sim_points: id integer NOT NULL DEFAULT nextval('de_sim_points_id_seq'::regclass),
-- Inherited from table de_sim_points: parent_geometry character varying(12),
-- Inherited from table de_sim_points: used boolean DEFAULT false,
-- Inherited from table de_sim_points: geom geometry(Point,900913),
CONSTRAINT de_sim_points_end_pkey PRIMARY KEY (id)
)
INHERITS (de_sim_points)
WITH (
FILLFACTOR=50,
OIDS=FALSE
);
ALTER TABLE de_sim_points_end
OWNER TO benjamin;

CREATE INDEX de_sim_points_end_geom_idx
ON de_sim_points_end
USING gist
(geom)
WITH (FILLFACTOR=100);
ALTER TABLE de_sim_points_end CLUSTER ON de_sim_points_end_geom_idx;

CREATE INDEX de_sim_points_end_parent_relation_idx
ON de_sim_points_end
USING btree
(parent_geometry COLLATE pg_catalog."default")
WITH (FILLFACTOR=100);

CREATE INDEX de_sim_points_end_used_idx
ON de_sim_points_end
USING btree
(used)
WITH (FILLFACTOR=50);


The SQL Query and Python Code used to match the rows can be found at http://ift.tt/1CqeZIq (I will not post it here due to its length). But basically what I do is:



  1. SELECT rows, randomly ordered from start point table

  2. JOIN rows, randomly ordered from destination point table

  3. Take the result, iterate over it and update the points in start and destination table as used

  4. Save the result in another table (done by another thread)


When I was using multiple processes with SELECT ... FOR UPDATE I ran into deadlocks (currently I just use one process, so the FOR UPDATE statement is currently omitted). So far I have read a bit about SERIALIZABLE and so forth locking mechanisms, but yet I fail to fully grasp them.


Is it possible to make the matching multiprocessing compatible? How whould the statements look like?





Aucun commentaire:

Enregistrer un commentaire