
Analysis of Watermark Removal Models



Watermarks are routinely added to digital images for purposes ranging from copyright protection to branding, but there are cases where they need to be removed. This article surveys several approaches to watermark removal in digital images. It begins with “Deep Image Prior”, which shows that the architecture of a neural network can itself capture image statistics, enabling watermark removal without training on a large dataset. Next, “WDNet” uses a two-stage generator that separates the watermark from the image as well as removing it, which improves removal accuracy. A method based on “cGANs” aims to leave minimal watermark residue while keeping the result photo-realistic, using a combined loss function and a patch-based discriminator. A further method, “self-calibrated localization and background refinement”, addresses incomplete watermark detection and degraded background texture. Finally, the techniques are applied to real-world hardware product images.


Watermark Removal using Deep Image Priors


The paper Deep Image Prior challenges the prevailing view that convolutional networks acquire realistic image priors only by learning from a large number of training examples. It shows that the structure of the generator network is itself sufficient to capture a great deal of low-level image statistics before any training takes place.


In essence, a randomly initialized neural network is used as a handcrafted prior for standard inverse problems such as denoising, super-resolution, and image inpainting.


The approach fits an untrained convolutional network to a single degraded image rather than training it on many examples. The fitted network weights then act as a parametrization of the restored image. No part of the network is learned from an external dataset.


Although the paper focuses on general inverse problems such as image inpainting, the method also applies to watermark removal. The repository demonstrates this by manually covering the watermarked area with black ink in a tool such as MS Paint, so that the blackened pixels take the value zero.


The convolutional model then treats the problem as image inpainting and predicts values for the zeroed pixels. The drawback is that this approach is labour-intensive and time-consuming, because the watermark region has to be marked out by hand.
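The masking step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the repository's code: the function names and the toy 3x3 image are assumptions, and a real implementation would use tensors and an optimiser to update the network. The point it shows is that the fitting loss is computed only over known pixels, so the network is free to fill in the blackened region.

```python
def mask_from_black_pixels(image):
    """Return 1.0 for known pixels and 0.0 for pixels painted black (value 0)."""
    return [[0.0 if px == 0 else 1.0 for px in row] for row in image]


def masked_mse(output, target, mask):
    """Mean squared error over known pixels only; masked pixels are ignored."""
    total, count = 0.0, 0.0
    for out_row, tgt_row, m_row in zip(output, target, mask):
        for o, t, m in zip(out_row, tgt_row, m_row):
            total += m * (o - t) ** 2
            count += m
    return total / count if count else 0.0


# Toy 3x3 grayscale image; the centre pixel was painted black (0).
degraded = [[0.2, 0.4, 0.6],
            [0.2, 0.0, 0.6],
            [0.2, 0.4, 0.6]]
mask = mask_from_black_pixels(degraded)

# Hypothetical network output that fills the hole with 0.4.
network_output = [[0.2, 0.4, 0.6],
                  [0.2, 0.4, 0.6],
                  [0.2, 0.4, 0.6]]

# The hole does not contribute, so a perfect fit elsewhere gives zero loss.
loss = masked_mse(network_output, degraded, mask)
```

One consequence of deriving the mask from pixel values is that any genuinely black pixel elsewhere in the image is also treated as missing.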


WDNet: Watermark-Decomposition Network for Visible Watermark Removal

This paper targets visible watermark removal directly. It tries to handle the variability of watermarks in size, shape, colour and transparency, which is rarely addressed in the literature.


It does this with a two-stage generator architecture. The first stage performs a rough decomposition of the watermark from the watermarked image. It focuses mainly on localising the watermark region, a step that most methods skip because they map the watermarked image to a watermark-free image in a single step, treating the problem as plain image-to-image translation.


WDNet not only removes the watermark but also separates it from the image, and the separated watermark is then used for pixel-wise refinement in the second stage. In the first stage, a U-Net backbone estimates the rough watermark area and predicts a coarse watermark and its transparency. The second stage is a fairly small network made up of a few residual blocks.
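To see why predicting the watermark and its transparency is enough to recover the background, consider the standard alpha-compositing model, in which each watermarked pixel X is a blend of the watermark W and the clean pixel Y with opacity alpha. The sketch below assumes that model and works on single pixel values; the function names are illustrative and are not taken from the WDNet code.

```python
def composite(y, w, alpha):
    """Blend watermark pixel w over background pixel y with opacity alpha."""
    return alpha * w + (1.0 - alpha) * y


def recover_background(x, w, alpha, eps=1e-6):
    """Invert the blend: solve x = alpha*w + (1-alpha)*y for y."""
    if alpha <= 0.0:
        return x  # no watermark at this pixel, nothing to undo
    # When alpha is close to 1 the background is almost fully hidden,
    # so the denominator is clamped to avoid division by zero.
    return (x - alpha * w) / max(1.0 - alpha, eps)


y_true, w, alpha = 0.8, 0.2, 0.5
x = composite(y_true, w, alpha)
y_est = recover_background(x, w, alpha)
```

In practice the predicted W and alpha are only estimates, which is one reason a refinement stage is useful after this rough decomposition.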


Because the watermarks are separated from the image, they can be composited onto other images to build a larger augmented dataset for further fine-tuning. The paper also introduces a new watermark dataset that covers a wide variety of watermarks and is not restricted to grey-scale ones.
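The augmentation idea can be sketched as follows: an extracted watermark and its transparency map are pasted at a random position onto a clean image, producing a new (watermarked, clean) training pair. This is a minimal sketch on nested lists with hypothetical function names; it assumes the alpha-compositing model and is not the paper's data pipeline.

```python
import random


def paste_watermark(clean, watermark, alpha, top, left):
    """Composite a watermark patch onto a copy of the clean image at (top, left)."""
    out = [row[:] for row in clean]
    for i, w_row in enumerate(watermark):
        for j, w in enumerate(w_row):
            a = alpha[i][j]
            out[top + i][left + j] = a * w + (1.0 - a) * clean[top + i][left + j]
    return out


def random_training_pair(clean, watermark, alpha, rng):
    """Place the watermark at a random valid offset; return (watermarked, clean)."""
    top = rng.randint(0, len(clean) - len(watermark))
    left = rng.randint(0, len(clean[0]) - len(watermark[0]))
    return paste_watermark(clean, watermark, alpha, top, left), clean


rng = random.Random(0)  # seeded so the example is repeatable
clean = [[1.0] * 4 for _ in range(4)]
watermark = [[0.0, 0.0], [0.0, 0.0]]
alpha = [[0.5, 0.5], [0.5, 0.5]]
watermarked, target = random_training_pair(clean, watermark, alpha, rng)
```

Varying the position, and in a fuller version the scale and opacity, gives many training pairs from a single extracted watermark.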


An example of the model in action is below:

X: image with a watermark; W: extracted watermark; alpha: transparency; Y: watermark-free image


Towards Photo-Realistic Visible Watermark Removal with Conditional Generative Adversarial Networks

In this paper, the watermark removal model is built on conditional Generative Adversarial Networks (cGANs). The network has two interdependent components, a generator and a discriminator, which are trained together towards two goals: minimising any remaining trace of the watermark in the restored image, and keeping the result photo-realistic by preserving the image's low-level features.


The key ingredient is a loss function that combines an adversarial loss with a pixel-wise content loss. The discriminator uses a patch-based architecture conditioned on the input watermarked image, and is trained to distinguish restored images from their original watermark-free counterparts. The generator follows a U-Net-style architecture.
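A combined generator loss of this kind can be sketched as below. The sketch makes several assumptions that are not confirmed by this article: the content loss is taken to be a mean absolute (L1) error, the adversarial term is the usual negative log of the discriminator's per-patch "real" probability, and the weight of 100 is a placeholder rather than the paper's setting. Images are flattened to plain lists of floats for brevity.

```python
import math


def patch_adversarial_loss(patch_probs, eps=1e-12):
    """Generator-side adversarial loss: mean of -log(D) over all patch outputs."""
    return -sum(math.log(p + eps) for p in patch_probs) / len(patch_probs)


def l1_content_loss(restored, target):
    """Mean absolute pixel difference between restored and clean images."""
    return sum(abs(r - t) for r, t in zip(restored, target)) / len(restored)


def generator_loss(patch_probs, restored, target, content_weight=100.0):
    """Adversarial loss plus weighted pixel-wise content loss."""
    adv = patch_adversarial_loss(patch_probs)
    return adv + content_weight * l1_content_loss(restored, target)


# Discriminator is undecided (0.5) on both patches; pixels are off by 0.1 each.
patch_probs = [0.5, 0.5]
restored = [0.5, 0.5]
target = [0.4, 0.6]
total = generator_loss(patch_probs, restored, target)
```

Because the discriminator scores patches rather than the whole image, the adversarial term penalises local artefacts such as leftover watermark edges, while the content term keeps the output close to the clean image pixel by pixel.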


From what we have seen so far, however, the model has a limitation. It has mainly been applied to transparent and grayscale watermarks, so its effectiveness on other watermark types, notably coloured ones, still needs to be explored and validated.


There is also no readily available GitHub repository implementing this method. If none can be found, a course of action for obtaining an implementation will need to be decided.


WDNet inference on the given sample images:

Example Image 1:

After removal


Example Image 2:

After removal


Example Image 3:

After removal


Example Image 4:

After removal


Example Image 5:

After removal


Example Image 6:

After removal


Example Image 7:

After removal


Example Image 8:

After removal



Visible Watermark Removal via Self-calibrated Localization and Background Refinement

Existing approaches suffer from incompletely detected watermarks and degraded texture quality in the restored background. To address these issues, the authors design a two-stage multi-task network. The coarse stage consists of a watermark branch and a background branch: the watermark branch self-calibrates the roughly estimated mask and passes the calibrated mask to the background branch, which reconstructs the watermarked area. In the refinement stage, multi-level features are integrated to improve the texture quality of the watermarked area.
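The role of the mask can be illustrated with a simple blending step: reconstructed pixels are used only where the mask marks a watermark, and the original pixels are kept everywhere else. This is a common convention for mask-guided restoration and is given here as an assumption, since the article does not describe how the network composes its final output. The function name and toy values are illustrative.

```python
def blend_with_mask(watermarked, predicted_background, mask):
    """Use predicted pixels where mask is 1 and original pixels where mask is 0."""
    return [
        [m * b + (1.0 - m) * x for x, b, m in zip(x_row, b_row, m_row)]
        for x_row, b_row, m_row in zip(watermarked, predicted_background, mask)
    ]


# Toy 2x2 image: only the top-right pixel (0.5) is covered by a watermark.
watermarked = [[0.9, 0.5], [0.9, 0.9]]
predicted = [[0.8, 0.9], [0.7, 0.6]]  # hypothetical background-branch output
mask = [[0.0, 1.0], [0.0, 0.0]]       # hypothetical calibrated mask

restored = blend_with_mask(watermarked, predicted, mask)
```

The example also shows why mask quality matters: an incomplete mask would leave watermark pixels untouched, while an oversized mask would replace clean pixels with less accurate predictions.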





Result when the previous output image is passed through the model again:








Results after product extraction

Example 1:





Example 2:


No object was detected by the model
















Example 3:



Example 4:





Example 5:









Example 6:








Example 7:



Example 8:





Boundary-Aware Salient Object Detection

Inference of the model after extracting the product from the output of the SLBR model:










Result of super-resolution after watermark removal:











Final Results from Pipeline with Text Detection:










Background Removal Inference on Misumi Images:


Example 1: Cross Recessed Pan Head Tapping Screw, Type 1, A Shape with Various Coatings【1–12,000 Pieces Per Package】



Example 2: Cross Recessed Pan Head Tapping Screws, 2 Models B-0 Shape with Various Coatings【1–12,000 Pieces Per Package】