85 changes: 85 additions & 0 deletions examples/color/colormap_normalizations_funcnorm.py
@@ -0,0 +1,85 @@
"""
=====================================================================
Examples of normalization using :class:`~matplotlib.colors.FuncNorm`
=====================================================================
This is an example of how to perform a normalization using an arbitrary
function with :class:`~matplotlib.colors.FuncNorm`. A logarithmic normalization
and a square root normalization will be used as examples.
"""

import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt

import numpy as np


def main():
    fig, axes = plt.subplots(3, 2, gridspec_kw={
        'width_ratios': [1, 3.5]}, figsize=plt.figaspect(0.6))
Member: The indent is kind of funny here; I'd break before the gridspec_kw, not in the middle of its value, if possible. Also, you can add sharex='col' and this will automatically remove the tick labels in between plots.
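A minimal sketch of what the suggested call might look like (breaking before gridspec_kw and adding sharex='col'); the exact formatting is the author's call:

    fig, axes = plt.subplots(3, 2, sharex='col',
                             gridspec_kw={'width_ratios': [1, 3.5]},
                             figsize=plt.figaspect(0.6))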


    # Example of logarithm normalization using FuncNorm
Member: extraneous comment 'cause code explicitly shows this

    norm_log = colors.FuncNorm(f='log10', vmin=0.01)
    # The same can be achieved with
Member: this comment should be dropped and there should be a standalone example for that feature

    # norm_log = colors.FuncNorm(f=np.log10,
    #                            finv=lambda x: 10.**(x), vmin=0.01)

    # Example of root normalization using FuncNorm
    norm_sqrt = colors.FuncNorm(f='sqrt', vmin=0.0)
    # The same can be achieved with
Member: same as above, presenting users with 3 ways to do things off the bat can be super confusing. Also, is it really necessary to use two examples of the same norm in 1 example? I know you'll likely tell me it's more realistic, but I think examples should fundamentally be as small/basic as possible while still showing the functionality.

Member: Re "two examples of the same norm in 1 example?": If you're referring to the two Axes, one is the norm function, the other is the actual usage for a colormap; see the figure at the top.

Contributor Author: I do not really see a problem with having an example of multiple uses in this case (both the log and the sqrt), but I am happy to change it if everyone thinks that the example image is not appropriate.

Contributor Author: I also think that presenting multiple ways of doing the same thing (with the extra comment) gives the user extra insight into what can be done with the class, at a very low cost. But again, if everyone agrees that the comments are inappropriate, I am also happy to remove them.

Member: @QuLogic I was talking about the log and the sqrt norms and then also providing multiple methods of doing things. @alvarosg is there a colleague you can "hallway test" these docs (and warning messages) with? You have them read the doc/message and just ask what they think (and how they think it could be improved).

    # norm_sqrt = colors.FuncNorm(f='root{2}', vmin=0.)
    # or with
    # norm_sqrt = colors.FuncNorm(f=lambda x: x**0.5,
    #                             finv=lambda x: x**2, vmin=0.0)

    normalizations = [(None, 'Regular linear scale'),
                      (norm_log, 'Log normalization'),
                      (norm_sqrt, 'Root normalization')]

    for i, (norm, title) in enumerate(normalizations):
Member: No need for i:

    for ax_row, (norm, title) in zip(axes, normalizations):

Contributor Author: great idea :)

        X, Y, data = get_data()

        # Showing the normalization effect on an image
        ax2 = axes[i][1]
Member: I suggest either don't bother unpacking, or do ax1, ax2 = axes[i], but really I prefer not unpacking. And if you're gonna use (ax1, ax2), please use them in chronological order (ax1 before ax2), even though I get why you're doing them in reverse.

Member: The [i] will be gone based on my suggestion above, so not-unpacking may look even better after that.
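A small sketch of the two options being weighed, assuming the zip-based loop from the earlier comment (variable names are illustrative):

    # unpacking the row up front, in chronological order
    for (ax1, ax2), (norm, title) in zip(axes, normalizations):
        ...

    # or not unpacking at all
    for ax_row, (norm, title) in zip(axes, normalizations):
        cax = ax_row[1].imshow(data, cmap=cm.afmhot, norm=norm)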

        cax = ax2.imshow(data, cmap=cm.afmhot, norm=norm)
        ticks = cax.norm.ticks(5) if norm else np.linspace(0, 1, 6)
        fig.colorbar(cax, format='%.3g', ticks=ticks, ax=ax2)
        ax2.set_title(title)
        ax2.axes.get_xaxis().set_ticks([])
Member:

    ax2.xaxis.set_ticks([])
    ax2.yaxis.set_ticks([])

        ax2.axes.get_yaxis().set_ticks([])

        # Plotting the behaviour of the normalization
        ax1 = axes[i][0]
        d_values = np.linspace(cax.norm.vmin, cax.norm.vmax, 100)
        cm_values = cax.norm(d_values)
        ax1.plot(d_values, cm_values)
        ax1.set_xlabel('Data values')
        ax1.set_ylabel('Colormap values')

    plt.show()


def get_data(_cache=[]):
Member: I don't understand why the data needs to be produced in the loop if it's just going to be cached. Seems a bit over-engineered for an example.

Contributor Author: Yes, you are right. The reason it is like this is that originally each of the plots was done independently inside a function, but I changed it to a loop to comply with feedback from @story645, and I forgot to change that, thanks :)
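A minimal sketch of the agreed simplification, assuming get_data() drops the cache and is called once before the loop (combined here with the zip-based loop suggested earlier):

    X, Y, data = get_data()   # generate the data once, outside the loop

    for ax_row, (norm, title) in zip(axes, normalizations):
        ax1, ax2 = ax_row
        cax = ax2.imshow(data, cmap=cm.afmhot, norm=norm)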

    if len(_cache) > 0:
        return _cache[0]
    x = np.linspace(0, 1, 300)
Member: space between line 66 and line 67 please

    y = np.linspace(-1, 1, 90)
    X, Y = np.meshgrid(x, y)

    data = np.zeros(X.shape)

    def gauss2d(x, y, a0, x0, y0, wx, wy):
Member: isn't there a numpy or scipy function that just does this?

Contributor Author: Yep, in scipy, but sadly matplotlib does not depend on scipy :(

        return a0 * np.exp(-(x - x0)**2 / wx**2 - (y - y0)**2 / wy**2)
    N = 15
Member: I wouldn't bother with the N, just use it directly in the linspace in this example

    for x in np.linspace(0., 1, N):
        data += gauss2d(X, Y, x, x, 0, 0.25 / N, 0.25)

    data = data - data.min()
Member: since this isn't inline calculations, why not just data = (data-data.min()) / data.max()?

Contributor Author (@alvarosg, Dec 17, 2016): Because that would not normalize the data to [0,1]. The alternative would be:

    data = (data-data.min()) / (data.max()-data.min())

and that is a bit less computationally efficient (not that it matters much in this case, but it is what I am used to, hehehe).

    data = data / data.max()
    _cache.append((X, Y, data))

    return _cache[0]

main()
Member: feel like this should be in a main block, and might as well just directly put all the plotting code there instead of shoving it into a function... so

    if __name__ == '__main__':
        all the code currently in main()

Contributor Author (@alvarosg, Dec 17, 2016): The reason I did not do the main originally is because we know the examples generate the figures automatically through some process, and @efiring and I were not sure whether the actual process would run the file as main.

Yeah of course, I guess the only reason to have things in a function is so the data generation could be after the rest, but maybe now that it is much shortened it will not look that bad right in between where the norms are generated and where the loop starts.

Member (@QuLogic, Dec 17, 2016): The vast majority of examples (like >90%) use neither main nor __main__ stuff, though I'm sure some examples use classes derived from backend-specific things that I didn't count properly.

223 changes: 223 additions & 0 deletions lib/matplotlib/colors.py
@@ -960,6 +960,229 @@ def scaled(self):
        return (self.vmin is not None and self.vmax is not None)


class FuncNorm(Normalize):
"""
Creates a normalizer using a custom function

The normalizer will be a function mapping the data values into colormap
Member: can just start the docstring here, and all norms are functions that map data into colorspace, so you need to be more specific.

Member: No, the first line should be a "short" description (in fact, it should be one line, but we really don't enforce that very often).

Contributor Author: I forgot to address this in my previous commit. What about saying:

    The normalizer will use a provided custom function to map the data values
    into colormap values in the [0,1] range.

Member: Sorry @QuLogic :/ & sounds good @alvarosg

Member: I would start the docstring (which is describing a class) with "A norm based on a monotonic function". Then a blank line, followed by the second sentence of the present init docstring, followed by the remainder of the present init docstring (Parameters, etc.). This is in accord with the numpydoc specification for classes: the init args and kwargs are described in the class docstring, and there is no need for an init docstring at all.

Contributor Author: @efiring Thanks for pointing that out, I always thought the init could be documented either in the class or in the init itself. I will do it the way you suggested.

    values in the [0,1] range.
    """

    def __init__(self, f, finv=None, **normalize_kw):
Member: This should be **kwargs.

Contributor Author: I understand that kwargs is what is used in the general case; however, I used normalize_kw because all of these parameters are to be passed to the parent class Normalize. This is the same naming convention used for subplots.

Member: That's not the same; those are dictionaries that are individual arguments. In this case, it's not an argument, it's the placeholder that accepts all other non-explicit keyword arguments.

Contributor Author (@alvarosg, Dec 16, 2016): You are completely right. In that case I am thinking that it may just be better to take vmin, vmax and clip directly, and pass them explicitly to the parent class. Any downside to doing it like that?

Member: We should probably just follow the example of the other classes; LogNorm, BoundaryNorm and NoNorm only accept clip and the rest accept all three explicitly, so being explicit seems to be the best choice.

Contributor Author: Actually LogNorm does not have an initialization function, so it implicitly takes all three of them as well. So yes, I am convinced that it may be better to include them explicitly. It may be worth in that case putting the documentation for vmin, vmax, and clip in common variables so it can be reused across different classes, similarly to what they do here.
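A minimal sketch of the explicit signature being converged on here (hypothetical; the final code may differ), forwarding vmin, vmax and clip to Normalize instead of accepting **normalize_kw:

    def __init__(self, f, finv=None, vmin=None, vmax=None, clip=False):
        # handle f/finv as before, then pass the limits on explicitly
        super(FuncNorm, self).__init__(vmin=vmin, vmax=vmax, clip=clip)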

"""
Specify the function to be used, and its inverse, as well as other
parameters to be passed to `Normalize`. The normalization will be
calculated as (f(x)-f(vmin))/(f(max)-f(vmin)).
Member: max or vmax?


        Parameters
        ----------
        f : callable or string
            Function to be used for the normalization receiving a single
            parameter, compatible with scalar values and ndarrays.
            Alternatively a string from the list ['linear', 'quadratic',
Member: Just point to the _StrFunctionParser documentation ("Alternatively any string supported by _StrFunctionParser"), 'cause otherwise this list will have to be updated every time that function is updated.

Member: I was thinking that, but it's private; should we be linking to it here?

Contributor Author: Yes, this is precisely why I did not do it. I do not want to encourage people to have any direct contact with _StringFuncParser. They should just see the result, so at least we can still switch to something else in the future if something better than that comes up.

The best way to solve this would probably be to expose the available strings through a public helper function acting as an interface between _StringFuncParser and the values. Something like GetValidStringFuncs. But for now I think it may just be better to do it by hand.

Member: @QuLogic I get it, but we sort of are linking to it since it's the underlying engine... dunno

Member: An alternative would be to put the list of strings and the explanation of "p" in the Notes section. The advantage is that it would set it apart, and keep the Parameters block from being so long. The disadvantage is that it might be separating it too much from its parameter. It's up to you.

            'cubic', 'sqrt', 'cbrt', 'log', 'log10', 'power{a}', 'root{a}',
            'log(x+{a})', 'log10(x+{a})'] can be used, replacing 'a' by a
            number different than 0 when necessary.
        finv : callable, optional
            Inverse function of `f` that satisfies finv(f(x))==x. It is
            optional in cases where `f` is provided as a string.
Member: Optional when f is a string

        normalize_kw : dict, optional
            Dict with keywords (`vmin`,`vmax`,`clip`) passed
Member: It may be a dict in this function, but it's just all-other-keyword-args to any caller.

Member: Agree with @QuLogic that these should be individually documented.

            to `matplotlib.colors.Normalize`.

        Examples
        --------
        Creating a logarithmic normalization using the predefined strings:

        >>> import matplotlib.colors as colors
        >>> norm = colors.FuncNorm(f='log10', vmin=0.01, vmax=2)

        Or doing it manually:
Member: Or manually


        >>> import matplotlib.colors as colors
        >>> norm = colors.FuncNorm(f=lambda x: np.log10(x),
        ...                        finv=lambda x: 10.**(x),
        ...                        vmin=0.01, vmax=2)

        """

        if isinstance(f, six.string_types):
            func_parser = cbook._StringFuncParser(f)
            f = func_parser.function
            finv = func_parser.inverse
        if not callable(f):
            raise ValueError("`f` must be a callable or a string.")
Member: f must be a function or a string (I don't like using callable in user-facing docs 'cause I think it's a little too dev space)

Contributor Author: I always assume the user of a python module will also be a developer, and callable is a keyword of python, so IMO it is clearer than function.

Member: That's not true at all though in this case. You've got plenty of users for matplotlib in particular who are scientists but not devs, who aren't gonna be familiar with any python keyword they don't use all the time (and callable is rarely in that set).

Member: I think that at least when they are native English speakers they can figure it out quickly enough from the context and the structure of the word itself, "callable" -> "call" "able" -> "something that can be called". The word "string" would be much harder to understand than "callable": it's pure comp-sci jargon, not used anywhere else in this way, and not something that can be figured out from the word itself. We are not going to delete uses of "string" or "callable".

Member (@story645, Dec 24, 2016): Callable is equivalent to function. You'd still need to mention it was a string. And string is different 'cause it's used in every single intro python everything, callable isn't. Honestly, callable trips me up all the time and I'm a native English speaker with a CS background.

Member: Basically, I dunno, I see your point, but a) I'm always wary of straight transcriptions of the if statements that triggered the exceptions being the error messages, b) I sort of think there should maybe be a bigger discussion of who is matplotlib's expected audience.

Contributor Author: I would leave it callable, because I think it is a more accurate term. I think anyone able to use a callable (to pass it to the function) should know the term, and if not should be able to do a 5 s google search. In any case, let's not waste our energy discussing this, as I think it is pretty irrelevant.

Member (@story645, Jan 16, 2017): While I agree with you that this specific thing probably isn't worth fighting about, I feel in a general sense that it's bad practice to dismiss a usability concern as "well they should know what it's called and how to search for it" 'cause rarely are either of those statements true.


        if finv is None:
            raise ValueError("Inverse function `finv` not provided.")
        elif not callable(finv):
Member: Nit-pick, personal preference: I would use "if" here; the "el" part is not needed, and slightly misleading as to the control flow.

            raise ValueError("`finv` must be a callable.")

        self._f = f
        self._finv = finv

        super(FuncNorm, self).__init__(**normalize_kw)
Member: any particular reason this is put at the end rather than upfront?

Contributor Author: Nope, I will move it up


    def _update_f(self, vmin, vmax):
        # This method is to be used by derived classes in cases where
        # the limits vmin and vmax may require changing/updating the
        # function depending on vmin/vmax, for example rescaling it
        # to accomodate to the new interval.
Member: accommodate

        return
Member: This should be "pass", not "return". "pass" is the "do nothing" word.


    def __call__(self, value, clip=None):
        """
        Normalizes `value` data in the `[vmin, vmax]` interval into
        the `[0.0, 1.0]` interval and returns it.
Member: Backticks should be doubled for inline code styling.

Contributor Author: Oh, I did not realise, I will replace every occurrence of single with double.

Member: Single backticks are for cross-references.


        Parameters
        ----------
        value : float or ndarray of floats
Member: It can be a masked array, to handle missing values, or a python sequence, and it doesn't have to be float. So maybe just say "scalar or array-like".

            Data to be normalized.
        clip : boolean, optional
            Whether to clip the data outside the `[vmin, vmax]` limits.
            Default `self.clip` from `Normalize` (which defaults to `False`).

        Returns
        -------
        result : masked array of floats
            Normalized data to the `[0.0, 1.0]` interval. If clip == False,
            values smaller than `vmin` or greater than `vmax` will be clipped
            to -0.1 and 1.1 respectively.

        """
        if clip is None:
            clip = self.clip

        result, is_scalar = self.process_value(value)
        self.autoscale_None(result)

        vmin = float(self.vmin)
Member: does self.vmin/self.vmax need to be converted to float? I think there's an import at the top that forces division to always be floating point...

Contributor Author: This is a very good point, I did not notice that; I guess there is no need, in that case. Thanks!

        vmax = float(self.vmax)
Member: Should _check_vmin_vmax do the float conversion and return the two values, so you can write vmin, vmax = self._check_vmin_vmax()?
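A minimal sketch of what that could look like (assuming a _check_vmin_vmax helper as referenced in the comment; its actual body in the PR is not shown here):

    def _check_vmin_vmax(self):
        if self.vmin is None or self.vmax is None:
            raise ValueError("both vmin and vmax must be set")
        if self.vmin > self.vmax:
            raise ValueError("vmin must be smaller than vmax")
        return float(self.vmin), float(self.vmax)

Usage in __call__ would then reduce the two float() lines to:

    vmin, vmax = self._check_vmin_vmax()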


        self._update_f(vmin, vmax)

        if clip:
            result = np.clip(result, vmin, vmax)
            resultnorm = (self._f(result) - self._f(vmin)) / \
Member: Use parentheses around the whole expression and remove the backslash.
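A sketch of the parenthesized form being asked for (same computation, no backslash continuation):

    resultnorm = ((self._f(result) - self._f(vmin))
                  / (self._f(vmax) - self._f(vmin)))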

                (self._f(vmax) - self._f(vmin))
Member: Would it be useful to cache any of these?

Contributor Author: I have been considering this, but the problem is that the cache would depend on vmin and vmax, and checking whether the cache is up to date, along with having to include new variables for the cache, would make the code much uglier.

I guess it would make sense in cases where evaluating f is expensive, but even in those cases, we would still have to evaluate f(result), which typically will consist of many values. Also, in general, the functions typically used for normalization should not be very expensive to evaluate... (although we should never underestimate the user, hehehe)

        else:
            resultnorm = result.copy()
            mask_over = result > vmax
            mask_under = result < vmin
            mask = (result >= vmin) * (result <= vmax)
Member: What about something like this 'cause I feel like this is a bit too much on the clever but obfuscating side?

    mask = mask_over || mask_under

and then just use ~mask everywhere you're using mask.

Member: Or mask = ~(mask_over | mask_under) or mask = ~mask_over & ~mask_under?

Contributor Author: Sure, I am always up for improving the efficiency!

            # Since the non linear function is arbitrary and may not be
            # defined outside the boundaries, we just set obvious under
            # and over values
            resultnorm[mask_over] = 1.1
            resultnorm[mask_under] = -0.1
            resultnorm[mask] = (self._f(result[mask]) - self._f(vmin)) / \
                (self._f(vmax) - self._f(vmin))

        return np.ma.array(resultnorm)
Member: Why is it a MaskedArray? Is that just what other Norms do? It doesn't seem like anything is actually masked.

Contributor Author: Precisely, the parent class returns masked arrays, even though it does not really ever set the mask to anything. It would make sense to use them for values outside the range; the problem is that in that case there would not be a way to say whether they are above the maximum value or below the minimum, and the plotting methods need this to use the under and over colours.


    def inverse(self, value):
        """
        Performs the inverse normalization from the `[0.0, 1.0]` into the
        `[vmin, vmax]` interval and returns it.

        Parameters
        ----------
        value : float or ndarray of floats
            Data in the `[0.0, 1.0]` interval.

        Returns
        -------
        result : float or ndarray of floats
            Data before normalization.

        """
        vmin = self.vmin
        vmax = self.vmax
        self._update_f(vmin, vmax)
        value = self._finv(
            value * (self._f(vmax) - self._f(vmin)) + self._f(vmin))
        return value

    @staticmethod
    def _fun_normalizer(fun):
Member: This appears unused?

Contributor Author: Oh, yes, this was used by some of the derived classes in the original PR, and I figured this was the best place for all of them to have access, as it is a general purpose normalization feature. I will remove it for now, and then we can decide where to include it when it is necessary for the first time.

        if fun(0.) == 0. and fun(1.) == 1.:
            return fun
        elif fun(0.) == 0.:
            return (lambda x: fun(x) / fun(1.))
        else:
            return (lambda x: (fun(x) - fun(0.)) / (fun(1.) - fun(0.)))

    def autoscale(self, A):
        """
        Autoscales the normalization based on the maximum and minimum values
        of `A`.

        Parameters
        ----------
        A : ndarray or maskedarray
            Array used to calculate the maximum and minimum values.

        """
        self.vmin = float(np.ma.min(A))
        self.vmax = float(np.ma.max(A))

    def autoscale_None(self, A):
        """
        Autoscales the normalization based on the maximum and minimum values
        of `A`, only if the limits were not already set.

        Parameters
        ----------
        A : ndarray or maskedarray
            Array used to calculate the maximum and minimum values.

        """
        if self.vmin is None:
            self.vmin = float(np.ma.min(A))
        if self.vmax is None:
            self.vmax = float(np.ma.max(A))
        self.vmin = float(self.vmin)
        self.vmax = float(self.vmax)
        if self.vmin > self.vmax:
            raise ValueError("vmin must be smaller than vmax")

    def ticks(self, nticks=13):
Member: Not sure about the mixing of concerns here, but I'll leave that to @efiring to determine.

Contributor Author (@alvarosg, Dec 16, 2016): Yeah, I also was not sure about this, because technically vmin and vmax do not belong to this class. Actually, the only thing my autoscale methods do differently is convert to float, so maybe I should just make a tiny change to the autoscale methods of Normalize to resemble this:

Methods in Normalize:

    def autoscale(self, A):
        self.vmin = np.ma.min(A)
        self.vmax = np.ma.max(A)

    def autoscale_None(self, A):
        ' autoscale only None-valued vmin or vmax'
        if self.vmin is None and np.size(A) > 0:
            self.vmin = np.ma.min(A)
        if self.vmax is None and np.size(A) > 0:
            self.vmax = np.ma.max(A)

Methods in FuncNorm:

    def autoscale(self, A):
        self.vmin = float(np.ma.min(A))
        self.vmax = float(np.ma.max(A))

    def autoscale_None(self, A):
        if self.vmin is None:
            self.vmin = float(np.ma.min(A))
        if self.vmax is None:
            self.vmax = float(np.ma.max(A))
        self.vmin = float(self.vmin)
        self.vmax = float(self.vmax)
        if self.vmin > self.vmax:
            raise ValueError("vmin must be smaller than vmax")

@efiring would it be OK to include those changes (casting to float and the vmax > vmin check) in Normalize, and remove the methods from FuncNorm?

"""
Returns an automatic list of `nticks` points in the data space
to be used as ticks in the colorbar.

Parameters
----------
nticks : integer, optional
Number of ticks to be returned. Default 13.

Returns
-------
ticks : ndarray
1d array of length `nticks` with the proposed tick locations.

"""
ticks = self.inverse(np.linspace(0, 1, nticks))
finalticks = np.zeros(ticks.shape, dtype=np.bool)
finalticks[0] = True
ticks = FuncNorm._round_ticks(ticks, finalticks)
return ticks

    @staticmethod
    def _round_ticks(ticks, permanenttick):
Member: Same as @QuLogic, a question for @efiring about mixing concerns. Wonder if all the tick stuff should be in a private class (or public) in ticker, and then normalize should just point to the default formatter and locators it should use. (This is an issue I ran headlong into w/ categorical norming too...)

Contributor Author: Yes, I am definitely not very happy with the way this currently is...

Re "and then normalize should just point to the default formatter and locators it should use": How does Normalize communicate with the formatters and tickers? Is there any good example around?

Member (@story645, Dec 17, 2016): Re "How does Normalize communicate with the formatters and tickers, is there any good example around?": It's really messy currently (the code is in colorbar.py and would probably require a refactor in the call stack). But that doesn't really matter to you, since you're having them explicitly get ticks via

    ticks = cax.norm.ticks(5) if norm else np.linspace(0, 1, 6)
    fig.colorbar(cax, format='%.3g', ticks=ticks, ax=ax_right)

cax.norm.ticks should really likely be its own tick Locator method that locates ticks based on some input (I guess convoluted functions). The downside is that it can't rely on the attributes in the norm (unless it's something like FuncNormLocator(norm)), but I think that prevents scope creep in norms.

Contributor Author: Yes, but ideally I would not like the user to have to call ticks manually, but to get those ticks automatically; I was not sure how to change the default ticker, to maybe implement a FuncTicker class...

Member: It would be easy to modify colorbar to use the ticks method of its Norm, if it exists, and if ticks are not provided by the user. The alternative of having all tick locators and formatters in tickers.py, and having Norms include a method or attributes for default locators and formatters, is also reasonable. I'm going to leave this question open for the moment, but we will need to return to it. I suspect the second of these two approaches will turn out to be the best.

Contributor Author: Yes, it would, but the problem is that the colorbar has two different ways to represent the colorbar, depending on spacing:

  • 'uniform': Represent the colorbar uniformly between 0 and 1 and then assign values for the ticks that are not uniform according to the normalization. This is the one I normally use, and the one that should use the tick values returned by ticks.
  • 'proportional': Stretch/compress the colorbar to represent the non-linearities given by the normalization, so the actual axis in the colorbar is uniform in the data values. In this case selecting the ticks is the same as with any linear axis.

What if I just make a FuncNorm locator class, and then add the corresponding line here?

Contributor Author: @efiring @story645 I have now made a new class FuncLocator (and tests), to include the behavior of proposing tick locations in a more appropriate place.

Instead of taking the norm itself as a parameter of the locator, I decided to just pass methods with the direct and inverse transformation. Then references to the methods of the norm object are passed in the initialization. This way the ticker module does not depend directly on colors, and on FuncNorm in particular, but still, if the FuncNorm instance is modified (the limits, for example, after calling clim), FuncNorm will adapt the behaviour of its methods and this will be available to the locator.

If you could please take a look, I am sure you can provide useful feedback.

PS: Happy new year :D
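A minimal sketch of the kind of locator being described here, assuming it is handed the norm's forward and inverse mappings rather than the norm object itself (the class and parameter names below are illustrative, not the PR's actual FuncLocator):

    import numpy as np
    from matplotlib.ticker import Locator

    class FuncNormLocator(Locator):
        """Hypothetical locator: ticks uniform in normalized space."""
        def __init__(self, forward, inverse, nticks=13):
            self._forward = forward   # data -> [0, 1], e.g. norm.__call__
            self._inverse = inverse   # [0, 1] -> data, e.g. norm.inverse
            self._nticks = nticks

        def __call__(self):
            # place ticks evenly in [0, 1], then map back to data values
            return self._inverse(np.linspace(0, 1, self._nticks))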

        ticks = ticks.copy()
        for i in range(len(ticks)):
            if i == 0 or i == len(ticks) - 1 or permanenttick[i]:
                continue
            d1 = ticks[i] - ticks[i - 1]
            d2 = ticks[i + 1] - ticks[i]
            d = min([d1, d2])
            order = -np.floor(np.log10(d))
            ticks[i] = float(np.round(ticks[i] * 10**order)) / 10**order
        return ticks


class LogNorm(Normalize):
"""
Normalize a given value to the 0-1 range on a log scale
41 changes: 41 additions & 0 deletions lib/matplotlib/tests/test_colors.py
@@ -146,6 +146,47 @@ def test_BoundaryNorm():
    assert_true(np.all(bn(vals).mask))


class TestFuncNorm(object):
    def test_limits_with_string(self):
        norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2.)
        assert_array_equal(norm([0.01, 2]), [0, 1.0])

    def test_limits_with_lambda(self):
        norm = mcolors.FuncNorm(f=lambda x: np.log10(x),
                                finv=lambda x: 10.**(x),
                                vmin=0.01, vmax=2.)
        assert_array_equal(norm([0.01, 2]), [0, 1.0])

    def test_limits_without_vmin_vmax(self):
        norm = mcolors.FuncNorm(f='log10')
        assert_array_equal(norm([0.01, 2]), [0, 1.0])

    def test_limits_without_vmin(self):
        norm = mcolors.FuncNorm(f='log10', vmax=2.)
Member: This is the same vmax you would get if you didn't set it, so I guess it doesn't really test that it's working.

Contributor Author: Yes, but that is a test in itself :P
You are right though, I will include tests where the values go above and below vmin/vmax, with and without the clip option.

Contributor Author: I have added test_clip_true, test_clip_false, test_clip__default_false to test the clipping behavior.

Contributor Author: Those changes should pretty much address the original comment.
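A rough sketch of what tests along those lines might look like (the actual test_clip_true / test_clip_false added to the PR may differ; the expected values assume the -0.1/1.1 under/over convention described in __call__):

    def test_clip_true(self):
        norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2., clip=True)
        # out-of-range values are clipped into [0, 1]
        assert_array_almost_equal(norm([0.001, 5.]), [0.0, 1.0])

    def test_clip_false(self):
        norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2., clip=False)
        # out-of-range values map to the sentinel under/over values
        assert_array_almost_equal(norm([0.001, 5.]), [-0.1, 1.1])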

        assert_array_equal(norm([0.01, 2]), [0, 1.0])

    def test_limits_without_vmax(self):
        norm = mcolors.FuncNorm(f='log10', vmin=0.01)
        assert_array_equal(norm([0.01, 2]), [0, 1.0])

    def test_intermediate_values(self):
        norm = mcolors.FuncNorm(f='log10')
        assert_array_almost_equal(norm([0.01, 0.5, 2]),
                                  [0, 0.73835195870437, 1.0])

    def test_inverse(self):
        norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2.)
        x = np.linspace(0.01, 2, 10)
        assert_array_almost_equal(x, norm.inverse(norm(x)))

Contributor Author: Add tests for scalar values

Contributor Author: this is now added

    def test_ticks(self):
        norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2.)
        expected = [0.01, 0.016, 0.024, 0.04, 0.06,
                    0.09, 0.14, 0.22, 0.3, 0.5,
                    0.8, 1.3, 2.]
        assert_array_almost_equal(norm.ticks(), expected)


def test_LogNorm():
"""
LogNorm ignored clip, now it has the same