OpenCL vs denoiseprofile

Hello,

Won over by DT on Windows, I migrated to Linux (Mint 18.3). On the advice of forum members, I even bought an Nvidia graphics card along the way, just for DT. But I did not find it as fast as on Windows. My graphics card is a GeForce GTX 1060 3GB and OpenCL appears to be installed.

Launching DT from a terminal with darktable -d opencl, I do indeed get the following errors every time I export a photo:

[pixelpipe_process] [export] using device 0
[default_process_tiling_cl_ptp] use tiling on module 'denoiseprofile' for image with full size 5342 x 3688
[default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 5224 x 3688 and overlap 32
[default_process_tiling_cl_ptp] tile (0, 0) with 5224 x 3688 at origin [0, 0]
[opencl_denoiseprofile] couldn't enqueue kernel! -4, devid 0
[default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'denoiseprofile' in tiling mode: 0
[opencl_pixelpipe] could not run module 'denoiseprofile' on gpu. falling back to cpu path

Only denoiseprofile seems to be affected. As indicated here https://www.darktable.org/2012/03/darktable-and-opencl/, I raised opencl_memory_headroom to 512. For reference, before reaching this error, I get the following at startup:

[code][opencl_init] opencl related configuration options:
[opencl_init]
[opencl_init] opencl: 1
[opencl_init] opencl_library: ‹  ›
[opencl_init] opencl_memory_requirement: 768
[opencl_init] opencl_memory_headroom: 512
[opencl_init] opencl_device_priority: ‹ /!0,// ›
[opencl_init] opencl_mandatory_timeout: 200
[opencl_init] opencl_size_roundup: 16
[opencl_init] opencl_async_pixelpipe: 0
[opencl_init] opencl_synch_cache: 0
[opencl_init] opencl_number_event_handles: 25
[opencl_init] opencl_micro_nap: 1000
[opencl_init] opencl_use_pinned_memory: 0
[opencl_init] opencl_use_cpu_devices: 0
[opencl_init] opencl_avoid_atomics: 0
[opencl_init]
[opencl_init] could not find opencl runtime library ‹ libOpenCL ›
[opencl_init] could not find opencl runtime library ‹ libOpenCL.so ›
[opencl_init] found opencl runtime library ‹ libOpenCL.so.1 ›
[opencl_init] opencl library ‹ libOpenCL.so.1 › found on your system and loaded
[opencl_init] found 1 platform
[opencl_init] found 1 device
[opencl_init] device 0 GeForce GTX 1060 3GB' has sm_20 support.
[opencl_init] device 0 GeForce GTX 1060 3GB’ supports image sizes of 16384 x 32768
[opencl_init] device 0 GeForce GTX 1060 3GB' allows GPU memory allocations of up to 753MB
[opencl_init] device 0: GeForce GTX 1060 3GB
GLOBAL_MEM_SIZE: 3013MB
MAX_WORK_GROUP_SIZE: 1024
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
DRIVER_VERSION: 384.111
DEVICE_VERSION: OpenCL 1.2 CUDA
[opencl_init] options for OpenCL compiler: -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/share/darktable/kernels"
[opencl_init] compiling program demosaic_ppg.cl’ ..
[opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/demosaic_ppg.cl.bin'
[opencl_load_program] successfully loaded program from /usr/share/darktable/kernels/demosaic_ppg.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program atrous.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/atrous.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/atrous.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program basic.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/basic.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/basic.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program blendop.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/blendop.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/blendop.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program highpass.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/highpass.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/highpass.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program nlmeans.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/nlmeans.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/nlmeans.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program gaussian.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/gaussian.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/gaussian.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program sharpen.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/sharpen.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/sharpen.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program extended.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/extended.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/extended.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program soften.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/soften.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/soften.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program bilateral.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/bilateral.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/bilateral.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program denoiseprofile.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/denoiseprofile.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/denoiseprofile.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program bloom.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/bloom.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/bloom.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program colorreconstruction.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/colorreconstruction.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/colorreconstruction.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program demosaic_other.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/demosaic_other.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/demosaic_other.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program demosaic_vng.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/demosaic_vng.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/demosaic_vng.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program demosaic_markesteijn.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/demosaic_markesteijn.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/demosaic_markesteijn.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program liquify.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/liquify.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/liquify.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program basecurve.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/basecurve.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/basecurve.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] compiling program locallaplacian.cl' .. [opencl_load_program] loaded cached binary program from file /home/pierre/.cache/darktable/cached_kernels_for_GeForceGTX10603GB/locallaplacian.cl.bin’
[opencl_load_program] successfully loaded program from `/usr/share/darktable/kernels/locallaplacian.cl’
[opencl_build_program] successfully built program
[opencl_build_program] BUILD STATUS: 0
BUILD LOG:

[opencl_init] kernel loading time: 0.0128
[opencl_init] OpenCL successfully initialized.
[opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
[opencl_init] 0 ‹ GeForce GTX 1060 3GB ›
[opencl_init] FINALLY: opencl is AVAILABLE on this system.
[opencl_init] initial status of opencl enabled flag is ON.
[opencl_create_kernel] successfully loaded kernel blendop_mask_Lab' (0) for device 0 [opencl_create_kernel] successfully loaded kernel blendop_mask_RAW’ (1) for device 0
[opencl_create_kernel] successfully loaded kernel blendop_mask_rgb' (2) for device 0 [opencl_create_kernel] successfully loaded kernel blendop_Lab’ (3) for device 0
[opencl_create_kernel] successfully loaded kernel blendop_RAW' (4) for device 0 [opencl_create_kernel] successfully loaded kernel blendop_rgb’ (5) for device 0
[opencl_create_kernel] successfully loaded kernel blendop_set_mask' (6) for device 0 [opencl_create_kernel] successfully loaded kernel blendop_display_channel’ (7) for device 0
[opencl_create_kernel] successfully loaded kernel zero' (8) for device 0 [opencl_create_kernel] successfully loaded kernel splat’ (9) for device 0
[opencl_create_kernel] successfully loaded kernel blur_line' (10) for device 0 [opencl_create_kernel] successfully loaded kernel blur_line_z’ (11) for device 0
[opencl_create_kernel] successfully loaded kernel slice' (12) for device 0 [opencl_create_kernel] successfully loaded kernel slice_to_output’ (13) for device 0
[opencl_create_kernel] successfully loaded kernel gaussian_column_1c' (14) for device 0 [opencl_create_kernel] successfully loaded kernel gaussian_transpose_1c’ (15) for device 0
[opencl_create_kernel] successfully loaded kernel gaussian_column_4c' (16) for device 0 [opencl_create_kernel] successfully loaded kernel gaussian_transpose_4c’ (17) for device 0
[opencl_create_kernel] successfully loaded kernel interpolation_resample' (18) for device 0 [opencl_create_kernel] successfully loaded kernel pad_input’ (19) for device 0
[opencl_create_kernel] successfully loaded kernel gauss_expand' (20) for device 0 [opencl_create_kernel] successfully loaded kernel gauss_reduce’ (21) for device 0
[opencl_create_kernel] successfully loaded kernel laplacian_assemble' (22) for device 0 [opencl_create_kernel] successfully loaded kernel process_curve’ (23) for device 0
[opencl_create_kernel] successfully loaded kernel write_back' (24) for device 0
[opencl_priorities] these are your device priorities:
[opencl_priorities] image preview export thumbnail
[opencl_priorities] 0 -1 0 0
[opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_priorities] image preview export thumbnail
[opencl_priorities] 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
[opencl_create_kernel] successfully loaded kernel shadows_highlights_mix’ (25) for device 0
[opencl_create_kernel] successfully loaded kernel tonecurve' (26) for device 0 [opencl_create_kernel] successfully loaded kernel colorzones’ (27) for device 0
[opencl_create_kernel] successfully loaded kernel clip_rotate_bilinear' (28) for device 0 [opencl_create_kernel] successfully loaded kernel clip_rotate_bicubic’ (29) for device 0
[opencl_create_kernel] successfully loaded kernel clip_rotate_lanczos2' (30) for device 0 [opencl_create_kernel] successfully loaded kernel clip_rotate_lanczos3’ (31) for device 0
[opencl_create_kernel] successfully loaded kernel whitebalance_4f' (32) for device 0 [opencl_create_kernel] successfully loaded kernel whitebalance_1f’ (33) for device 0
[opencl_create_kernel] successfully loaded kernel whitebalance_1f_xtrans' (34) for device 0 [opencl_create_kernel] successfully loaded kernel lowlight’ (35) for device 0
[opencl_create_kernel] successfully loaded kernel channelmixer' (36) for device 0 [opencl_create_kernel] successfully loaded kernel soften_overexposed’ (37) for device 0
[opencl_create_kernel] successfully loaded kernel soften_hblur' (38) for device 0 [opencl_create_kernel] successfully loaded kernel soften_vblur’ (39) for device 0
[opencl_create_kernel] successfully loaded kernel soften_mix' (40) for device 0 [opencl_create_kernel] successfully loaded kernel highlights_1f_clip’ (41) for device 0
[opencl_create_kernel] successfully loaded kernel highlights_1f_lch_bayer' (42) for device 0 [opencl_create_kernel] successfully loaded kernel highlights_1f_lch_xtrans’ (43) for device 0
[opencl_create_kernel] successfully loaded kernel highlights_4f_clip' (44) for device 0 [opencl_create_kernel] successfully loaded kernel zonesystem’ (45) for device 0
[opencl_create_kernel] successfully loaded kernel borders_fill' (46) for device 0 [opencl_create_kernel] successfully loaded kernel denoiseprofile_precondition’ (47) for device 0
[opencl_create_kernel] successfully loaded kernel denoiseprofile_init' (48) for device 0 [opencl_create_kernel] successfully loaded kernel denoiseprofile_dist’ (49) for device 0
[opencl_create_kernel] successfully loaded kernel denoiseprofile_horiz' (50) for device 0 [opencl_create_kernel] successfully loaded kernel denoiseprofile_vert’ (51) for device 0
[opencl_create_kernel] successfully loaded kernel denoiseprofile_accu' (52) for device 0 [opencl_create_kernel] successfully loaded kernel denoiseprofile_finish’ (53) for device 0
[opencl_create_kernel] successfully loaded kernel denoiseprofile_backtransform' (54) for device 0 [opencl_create_kernel] successfully loaded kernel denoiseprofile_decompose’ (55) for device 0
[opencl_create_kernel] successfully loaded kernel denoiseprofile_synthesize' (56) for device 0 [opencl_create_kernel] successfully loaded kernel denoiseprofile_reduce_first’ (57) for device 0
[opencl_create_kernel] successfully loaded kernel denoiseprofile_reduce_second' (58) for device 0 [opencl_create_kernel] successfully loaded kernel lowpass_mix’ (59) for device 0
[opencl_create_kernel] successfully loaded kernel rawprepare_1f' (60) for device 0 [opencl_create_kernel] successfully loaded kernel rawprepare_1f_unnormalized’ (61) for device 0
[opencl_create_kernel] successfully loaded kernel rawprepare_4f' (62) for device 0 [opencl_create_kernel] successfully loaded kernel colorcorrection’ (63) for device 0
[opencl_create_kernel] successfully loaded kernel highpass_invert' (64) for device 0 [opencl_create_kernel] successfully loaded kernel highpass_hblur’ (65) for device 0
[opencl_create_kernel] successfully loaded kernel highpass_vblur' (66) for device 0 [opencl_create_kernel] successfully loaded kernel highpass_mix’ (67) for device 0
[opencl_create_kernel] successfully loaded kernel colormapping_histogram' (68) for device 0 [opencl_create_kernel] successfully loaded kernel colormapping_mapping’ (69) for device 0
[opencl_create_kernel] successfully loaded kernel colorreconstruction_zero' (70) for device 0 [opencl_create_kernel] successfully loaded kernel colorreconstruction_splat’ (71) for device 0
[opencl_create_kernel] successfully loaded kernel colorreconstruction_blur_line' (72) for device 0 [opencl_create_kernel] successfully loaded kernel colorreconstruction_slice’ (73) for device 0
[opencl_create_kernel] successfully loaded kernel clip_and_zoom_demosaic_half_size' (74) for device 0 [opencl_create_kernel] successfully loaded kernel ppg_demosaic_green’ (75) for device 0
[opencl_create_kernel] successfully loaded kernel green_equilibration_lavg' (76) for device 0 [opencl_create_kernel] successfully loaded kernel green_equilibration_favg_reduce_first’ (77) for device 0
[opencl_create_kernel] successfully loaded kernel green_equilibration_favg_reduce_second' (78) for device 0 [opencl_create_kernel] successfully loaded kernel green_equilibration_favg_apply’ (79) for device 0
[opencl_create_kernel] successfully loaded kernel pre_median' (80) for device 0 [opencl_create_kernel] successfully loaded kernel ppg_demosaic_redblue’ (81) for device 0
[opencl_create_kernel] successfully loaded kernel clip_and_zoom' (82) for device 0 [opencl_create_kernel] successfully loaded kernel border_interpolate’ (83) for device 0
[opencl_create_kernel] successfully loaded kernel color_smoothing' (84) for device 0 [opencl_create_kernel] successfully loaded kernel passthrough_monochrome’ (85) for device 0
[opencl_create_kernel] successfully loaded kernel clip_and_zoom_demosaic_passthrough_monochrome' (86) for device 0 [opencl_create_kernel] successfully loaded kernel vng_border_interpolate’ (87) for device 0
[opencl_create_kernel] successfully loaded kernel vng_lin_interpolate' (88) for device 0 [opencl_create_kernel] successfully loaded kernel clip_and_zoom_demosaic_third_size_xtrans’ (89) for device 0
[opencl_create_kernel] successfully loaded kernel vng_green_equilibrate' (90) for device 0 [opencl_create_kernel] successfully loaded kernel vng_interpolate’ (91) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_initial_copy' (92) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_green_minmax’ (93) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_interpolate_green' (94) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_solitary_green’ (95) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_recalculate_green' (96) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_red_and_blue’ (97) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_interpolate_twoxtwo' (98) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_convert_yuv’ (99) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_differentiate' (100) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_homo_threshold’ (101) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_homo_set' (102) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_homo_sum’ (103) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_homo_max' (104) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_homo_max_corr’ (105) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_homo_quench' (106) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_zero’ (107) for device 0
[opencl_create_kernel] successfully loaded kernel markesteijn_accu' (108) for device 0 [opencl_create_kernel] successfully loaded kernel markesteijn_final’ (109) for device 0
[opencl_create_kernel] successfully loaded kernel colorin_unbound' (110) for device 0 [opencl_create_kernel] successfully loaded kernel colorin_clipping’ (111) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_lut' (112) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_zero’ (113) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_ev_lut' (114) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_compute_features’ (115) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_blur_h' (116) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_blur_v’ (117) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_expand' (118) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_reduce’ (119) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_detail' (120) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_adjust_features’ (121) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_blend_gaussian' (122) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_blend_laplacian’ (123) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_normalize' (124) for device 0 [opencl_create_kernel] successfully loaded kernel basecurve_reconstruct’ (125) for device 0
[opencl_create_kernel] successfully loaded kernel basecurve_finalize' (126) for device 0 [opencl_create_kernel] successfully loaded kernel lens_distort_bilinear’ (127) for device 0
[opencl_create_kernel] successfully loaded kernel lens_distort_bicubic' (128) for device 0 [opencl_create_kernel] successfully loaded kernel lens_distort_lanczos2’ (129) for device 0
[opencl_create_kernel] successfully loaded kernel lens_distort_lanczos3' (130) for device 0 [opencl_create_kernel] successfully loaded kernel lens_vignette’ (131) for device 0
[opencl_create_kernel] successfully loaded kernel levels' (132) for device 0 [opencl_create_kernel] successfully loaded kernel relight’ (133) for device 0
[opencl_create_kernel] successfully loaded kernel eaw_decompose' (134) for device 0 [opencl_create_kernel] successfully loaded kernel eaw_synthesize’ (135) for device 0
[opencl_create_kernel] successfully loaded kernel nlmeans_init' (136) for device 0 [opencl_create_kernel] successfully loaded kernel nlmeans_dist’ (137) for device 0
[opencl_create_kernel] successfully loaded kernel nlmeans_horiz' (138) for device 0 [opencl_create_kernel] successfully loaded kernel nlmeans_vert’ (139) for device 0
[opencl_create_kernel] successfully loaded kernel nlmeans_accu' (140) for device 0 [opencl_create_kernel] successfully loaded kernel nlmeans_finish’ (141) for device 0
[opencl_create_kernel] successfully loaded kernel colisa' (142) for device 0 [opencl_create_kernel] successfully loaded kernel splittoning’ (143) for device 0
[opencl_create_kernel] successfully loaded kernel pixelmax_first' (144) for device 0 [opencl_create_kernel] successfully loaded kernel pixelmax_second’ (145) for device 0
[opencl_create_kernel] successfully loaded kernel global_tonemap_reinhard' (146) for device 0 [opencl_create_kernel] successfully loaded kernel global_tonemap_drago’ (147) for device 0
[opencl_create_kernel] successfully loaded kernel global_tonemap_filmic' (148) for device 0 [opencl_create_kernel] successfully loaded kernel flip’ (149) for device 0
[opencl_create_kernel] successfully loaded kernel invert_1f' (150) for device 0 [opencl_create_kernel] successfully loaded kernel invert_4f’ (151) for device 0
[opencl_create_kernel] successfully loaded kernel colorout' (152) for device 0 [opencl_create_kernel] successfully loaded kernel colorcontrast’ (153) for device 0
[opencl_create_kernel] successfully loaded kernel bloom_threshold' (154) for device 0 [opencl_create_kernel] successfully loaded kernel bloom_hblur’ (155) for device 0
[opencl_create_kernel] successfully loaded kernel bloom_vblur' (156) for device 0 [opencl_create_kernel] successfully loaded kernel bloom_mix’ (157) for device 0
[opencl_create_kernel] successfully loaded kernel colorchecker' (158) for device 0 [opencl_create_kernel] successfully loaded kernel colorize’ (159) for device 0
[opencl_create_kernel] successfully loaded kernel sharpen_hblur' (160) for device 0 [opencl_create_kernel] successfully loaded kernel sharpen_vblur’ (161) for device 0
[opencl_create_kernel] successfully loaded kernel sharpen_mix' (162) for device 0 [opencl_create_kernel] successfully loaded kernel warp_kernel’ (163) for device 0
[opencl_create_kernel] successfully loaded kernel overexposed' (164) for device 0 [opencl_create_kernel] successfully loaded kernel monochrome_filter’ (165) for device 0
[opencl_create_kernel] successfully loaded kernel monochrome' (166) for device 0 [opencl_create_kernel] successfully loaded kernel ashift_bilinear’ (167) for device 0
[opencl_create_kernel] successfully loaded kernel ashift_bicubic' (168) for device 0 [opencl_create_kernel] successfully loaded kernel ashift_lanczos2’ (169) for device 0
[opencl_create_kernel] successfully loaded kernel ashift_lanczos3' (170) for device 0 [opencl_create_kernel] successfully loaded kernel vignette’ (171) for device 0
[opencl_create_kernel] successfully loaded kernel colorbalance' (172) for device 0 [opencl_create_kernel] successfully loaded kernel velvia’ (173) for device 0
[opencl_create_kernel] successfully loaded kernel vibrance' (174) for device 0 [opencl_create_kernel] successfully loaded kernel graduatedndp’ (175) for device 0
[opencl_create_kernel] successfully loaded kernel graduatedndm' (176) for device 0 [opencl_create_kernel] successfully loaded kernel rawoverexposed_mark_cfa’ (177) for device 0
[opencl_create_kernel] successfully loaded kernel rawoverexposed_mark_solid' (178) for device 0 [opencl_create_kernel] successfully loaded kernel rawoverexposed_falsecolor’ (179) for device 0
[opencl_create_kernel] successfully loaded kernel profilegamma' (180) for device 0 [opencl_create_kernel] successfully loaded kernel exposure’ (181) for device 0[/code]

Is there anything I can do to get a really fast DT?

EDIT

[hr]
I'm an idiot! By raising opencl_memory_headroom in darktablerc up to 1024, I finally get:

[pixelpipe_process] [export] using device 0
[default_process_tiling_cl_ptp] use tiling on module 'denoiseprofile' for image with full size 5342 x 3688
[default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 4204 x 3688 and overlap 32
[default_process_tiling_cl_ptp] tile (0, 0) with 4204 x 3688 at origin [0, 0]
[default_process_tiling_cl_ptp] tile (1, 0) with 1202 x 3688 at origin [4140, 0]

The increase may be needed because I use a 4K screen.
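To put rough numbers on this, here is a minimal sketch of the memory budget. It assumes that darktable leaves opencl_memory_headroom megabytes of the card untouched and works with the rest, and that one image buffer holds 4 float channels per pixel; both are assumptions for illustration, and the function names are made up. The 3013MB figure and the image size come from the logs above.

```python
# Rough GPU memory budget. Assumption: darktable leaves
# `opencl_memory_headroom` MB untouched and uses what remains.

def usable_gpu_mb(global_mem_mb: int, headroom_mb: int) -> int:
    """Memory (MB) assumed to remain for darktable after the headroom."""
    return max(global_mem_mb - headroom_mb, 0)

def buffer_mb(width: int, height: int, channels: int = 4, bytes_per_value: int = 4) -> float:
    """Size in MB of one image buffer (assumed 4 float channels per pixel)."""
    return width * height * channels * bytes_per_value / (1024 * 1024)

GLOBAL_MEM_MB = 3013       # GLOBAL_MEM_SIZE reported in the startup log
FULL_SIZE = (5342, 3688)   # full image size reported in the tiling log

for headroom in (512, 1024):
    print(headroom, usable_gpu_mb(GLOBAL_MEM_MB, headroom), round(buffer_mb(*FULL_SIZE), 1))
```

A larger headroom shrinks the budget, which would be consistent with the smaller tiles (4204 instead of 5224 pixels wide) seen after going from 512 to 1024.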

To ask a question all the same: which settings would you recommend in the « fonctionnement » preferences tab for this tiling business, as well as for the other memory-related parameters?

[list]
[*]My graphics card has 3GB of memory
[*]The PC has 16GB of RAM
[/list]

Thanks @temperdu, your thread let me check and correct the opencl_memory_* values in my darktablerc.

I don't have a 4K screen, and I also had to go up to 800 for headroom to avoid "falling back" to the CPU (nVidia GK107GLM [Quadro K1000M] with 2 GB of RAM). I do have two screens connected, though.

I haven't found any information about the opencl_ parameters in darktablerc. Yet I would have liked explanations beyond headroom, about which the link you cite says:

Because I arrived at 800 by trial and error, repeating a JPEG export with different values and checking the output of -d opencl, but how can I know whether this value is right for other processing, short of checking the console case by case?
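One way to avoid reading the console line by line is to save the -d opencl output and scan it for the fallback message. The sketch below is only an illustration: the function name is hypothetical, and the pattern is taken from the "could not run module ... on gpu" line quoted earlier in this thread, so it may need adjusting for other darktable versions.

```python
import re

# Matches the fallback message printed by `darktable -d opencl`,
# as it appears in the log quoted earlier in this thread.
FALLBACK_RE = re.compile(r"could not run module '([^']+)' on gpu")

def modules_falling_back(log_text: str) -> list[str]:
    """Return the names of modules that fell back to the CPU, without repeats."""
    seen = []
    for name in FALLBACK_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen

sample = (
    "[opencl_denoiseprofile] couldn't enqueue kernel! -4, devid 0\n"
    "[opencl_pixelpipe] could not run module 'denoiseprofile' on gpu. falling back to cpu path\n"
)
print(modules_falling_back(sample))
```

An empty list for a given export means no module reported a fallback in that log; it says nothing about images or module stacks that were not tested.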

@temperdu, would you mind putting the logs in "code" boxes rather than "quote" boxes?

"Is there anything I can do to get a really fast DT?"

You have to tune the parameters by trial and error, measuring export times on a reference image. There is no absolute rule; it depends on the graphics card and on its driver version. I remember that the Nvidia 375 driver gives the best results on my machine, with settings different from those that worked best with Nvidia 340.

Good luck…

Here: https://www.darktable.org/usermanual/fr/darktable_and_opencl_optimization.html (and it is even in French).

Happy trial and error, as Aurélien Pierre recommends!

Thanks for re-editing your first post.

Your exchanges, along with a discussion on the darktable-users mailing list, prompted me to go and check my configuration.

For that I used the tools mentioned in the list emails. Namely, a test raw file with its xmp file, which can be downloaded:
$ wget http://www.mirada.ch/bench.SRW
$ wget http://www.mirada.ch/bench.SRW.xmp

As well as the command lines to use it:
$ darktable-cli bench.SRW test.jpg --core --disable-opencl -d perf
$ darktable-cli bench.SRW test.jpg --core -d perf -d opencl
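For repeated runs, the two commands above can be wrapped in a small timing script. This is a sketch under assumptions: it presumes darktable-cli is on the PATH and that bench.SRW and its xmp sit in the current directory, and the helper names are invented here. Nothing runs until run_benchmark() is called by hand.

```python
import subprocess
import time

def build_commands(raw="bench.SRW", out="test.jpg"):
    """The two darktable-cli invocations quoted above, as argument lists."""
    cpu = ["darktable-cli", raw, out, "--core", "--disable-opencl", "-d", "perf"]
    gpu = ["darktable-cli", raw, out, "--core", "-d", "perf", "-d", "opencl"]
    return {"cpu": cpu, "opencl": gpu}

def time_command(args):
    """Run one command; return (elapsed seconds, captured stdout + stderr)."""
    start = time.perf_counter()
    done = subprocess.run(args, capture_output=True, text=True)
    return time.perf_counter() - start, done.stdout + done.stderr

def run_benchmark():
    """Time both variants; call this on a machine with darktable installed."""
    for label, args in build_commands().items():
        seconds, _log = time_command(args)
        print(f"{label}: {seconds:.1f} s")
```

The wall-clock time measured this way includes darktable's startup, so it is only useful for comparing runs on the same machine.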

Running the first line (without opencl), processing takes about [color=#ff3333]37 seconds[/color].

I then tried with opencl, but looking at the processing log I noticed that one module was not being processed with opencl: the atrous module (the equalizer). The error message was as follows:

[default_process_tiling_cl_ptp] use tiling on module 'atrous' for image with full size 5490 x 3660
[default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 4400 x 3660 and overlap 256
[default_process_tiling_cl_ptp] tile (0, 0) with 4400 x 3660 at origin [0, 0]
[opencl_atrous] couldn't enqueue kernel! -4
[default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'atrous' in tiling mode: 0
[opencl_pixelpipe] could not run module 'atrous' on gpu. falling back to cpu path
The processing time in this case was about [color=#ff3333]15 seconds[/color].
I then experimented with all the opencl_xxxx variables and eventually found that I had to increase the value of opencl_memory_headroom. Its default value is 300. Contrary to what one might think from reading the description of this variable, you must not lower it but raise it so that all modules are processed with opencl. However, it should not be raised too much, otherwise everything is indeed processed with opencl but performance drops. Put plainly, when I raise it just enough for all modules to be processed with opencl, which on my machine is 500, I get a processing time of about [color=#ff3333]8.5 seconds[/color], whereas when I raise it a lot, for example to 2000, the time approaches 10 seconds. Examining the log shows that at that value many modules start splitting their processing into successive blocks (tiles, in darktable jargon), which is not the case at 500. My graphics card has only three gigabytes of memory, which must explain the phenomenon.

Conclusion: increase opencl_memory_headroom just enough for all modules to be processed with OpenCL, but no more. For me, that is 500.
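
For anyone who would rather not read the whole log by eye, here is a minimal Python sketch of that check. It assumes the output of darktable -d opencl has been saved as text and that the messages are worded exactly as in the log quoted above. The function name summarize_opencl_log is made up for this example and is not part of darktable.

```python
import re

# Message patterns copied from the log quoted above; the wording may differ
# in other darktable versions (assumption).
FALLBACK_RE = re.compile(r"could not run module '([^']+)' on gpu")
TILING_RE = re.compile(r"use tiling on module '([^']+)'")


def summarize_opencl_log(log_text):
    """Return (modules that fell back to the CPU, modules that were tiled)."""
    fallbacks = sorted(set(FALLBACK_RE.findall(log_text)))
    tiled = sorted(set(TILING_RE.findall(log_text)))
    return fallbacks, tiled


# The atrous excerpt from this thread, used as sample input.
SAMPLE_LOG = """\
[default_process_tiling_cl_ptp] use tiling on module 'atrous' for image with full size 5490 x 3660
[opencl_atrous] couldn't enqueue kernel! -4
[opencl_pixelpipe] could not run module 'atrous' on gpu. falling back to cpu path
"""

fallbacks, tiled = summarize_opencl_log(SAMPLE_LOG)
print("fell back to CPU:", fallbacks)
print("used tiling:", tiled)
```

Going by the conclusion above, a non-empty first list suggests the headroom is still too low, while a second list that keeps growing suggests it has been raised further than needed.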

I'm not sure this is the best test. What matters most to me is interactivity. With OpenCL and the liquify module, for example, there is a big difference in interactive use when you move around the image. More generally, I have noticed several times that OpenCL was not enabled on my machine while I was using darktable. And yet when I run the export test (which processes the WHOLE image) like you did above, I get equal or worse performance with OpenCL. In the end, though, I think export speed matters much less than smooth editing. I export at the end and can go have a coffee while the machine works :slight_smile:

Otherwise, like you, going to 500 gives better performance because more modules run on the GPU.

Export speed matters as much as interactivity when you find yourself exporting 200 wedding photos at 36 MP (at high ISO, so with denoising cranked up).

That took me several hours…

Thanks! :slight_smile:

Well, it doesn't cover all the parameters, and the results aren't very conclusive in the tests I've been able to run so far.
You need a powerful card with plenty of RAM to really feel the effects of OpenCL.

Completely agree, smoothness matters as much as export speed. In that sense this little test, which isn't very scientific, still makes it easy to spot a configuration problem. Once that is fixed, both export AND smoothness improve.

Thanks JPVerrue. I ran the test. I go from 27s to 5s. But I have to set opencl_memory_headroom to at least 1150 with my 4K display.

Does anyone know what darktablerc: host_memory_limit corresponds to? There is an explanation here, but I don't quite understand it: https://www.darktable.org/2012/03/darktable-and-memory/

Is it better for it to be small or large?

It's better for it to be large, within the limits of what your system can handle.

If it's too small, darktable cuts your image into tiles that it processes separately. The smaller the value, the more tiles you get. And tiling costs processor time (overhead). So fewer tiles = less overhead = more speed. It's a classic trade-off where minimizing RAM usage maximizes the amount of computation, and vice versa (roughly).
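
To make the tile count a bit more concrete, here is a rough Python sketch. It is a simplified model, not darktable's actual tiling code: it assumes each tile contributes its size minus an overlap border on both sides, and that an axis is not split at all when one tile covers it. The function names are hypothetical. I only checked it against the two tiling logs quoted in this thread, where it reproduces the reported "(2 x 1)" grid.

```python
import math


def tiles_along_axis(full_size, max_tile, overlap):
    """Tiles needed along one axis under the simplified model (assumption)."""
    if max_tile >= full_size:
        return 1  # one tile covers the whole axis, no split needed
    # Each tile only contributes its interior: the overlap border on both
    # sides is recomputed by the neighbouring tile.
    step = max(max_tile - 2 * overlap, 1)
    return math.ceil(full_size / step)


def tile_grid(width, height, max_w, max_h, overlap):
    """Return (tiles horizontally, tiles vertically)."""
    return (
        tiles_along_axis(width, max_w, overlap),
        tiles_along_axis(height, max_h, overlap),
    )


# Numbers taken from the two logs in this thread.
print(tile_grid(5490, 3660, 4400, 3660, 256))  # atrous log: (2, 1)
print(tile_grid(5342, 3688, 5224, 3688, 32))   # denoiseprofile log: (2, 1)
```

In this model, shrinking the maximum tile size (less memory per tile) raises the tile count, and every extra tile recomputes its overlap border, which is where the overhead comes from.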

With the boat photo, I ended up having to go up to 1280 because sometimes, unpredictably (at least for me), a module did not use the GPU. But I kept experimenting, with 24 MP photos from a Fuji, and had to go up to

opencl_memory_headroom=1664

With a 3 GB card it tiles a bit, but it is still very effective. On a very noisy image with moderate noise processing (one module for chroma and another for luminance, the latter with a mask mostly on the shadows), I go from 29s to 8s, which is not bad. On less noisy images, I'm at 5s. Leaving just a single denoise-profile on the CPU (an i5@3.4GHz) turns those 5s into 11s.

Moral: don't test only on the boat, but on the images you actually process, preferably ones with a lot of noise reduction and demanding modules. So leave yourself some margin. IMHO it's worth keeping a little margin even if that means tiling a bit. You lose very slightly every time, but you are sure to avoid the worst cases, which can be painful. Typically, even when halving the remaining memory, I lose 10% to 15%. With a 10% margin on [size=small][font=Monaco, Consolas, Courier, monospace]opencl_memory_headroom[/font][/size], I lose nothing in performance.
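
As a small illustration of that margin, here is a Python sketch. It reads "10% margin" as "add 10% to the smallest value that worked", which is only my interpretation. It also assumes darktablerc stores settings as plain key=value lines, like the opencl_memory_headroom=1664 line above. The helper names with_margin and set_rc_value are invented for this example.

```python
def with_margin(headroom_mb, margin_percent=10):
    """Add a safety margin to a headroom value, rounding up to a whole number."""
    # Integer arithmetic avoids floating-point surprises when rounding up.
    return (headroom_mb * (100 + margin_percent) + 99) // 100


def set_rc_value(rc_text, key, value):
    """Return rc_text with key=value replaced, or appended if key is absent."""
    lines = rc_text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        if line.startswith(key + "="):
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example on an in-memory string; no real configuration file is touched.
rc = "opencl=TRUE\nopencl_memory_headroom=1664\n"
print(set_rc_value(rc, "opencl_memory_headroom", with_margin(1664)))
```

The sketch works on a string rather than on the file itself, so it can be tried safely before applying the same change by hand.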

Still, if I had to do it again, I'd get a graphics card with 6 GB of RAM for peace of mind.