Archive for the ‘maths’ Category


Wednesday, February 10th, 2016

For years now, I've been seeing articles about the TPP and TTIP treaties showing up on the various link aggregators I read. Without exception these articles warn of all the terrible effects that will come to pass if these treaties are passed, in areas as diverse as copyright, consumer safety, democracy and the environment.

The general scheme seems to be this: instead of engaging in public debate to determine what the laws should be in order to properly balance corporate profits against the interests of individuals, negotiators representing various industries meet in secret to draw up a set of laws which are then presented to signatory countries to ratify as a fait accompli. There may be a lot of very positive changes in such a treaty, but separating these from the negative ones will be impossible, and as long as the positive changes outweigh the negative ones the treaty will get ratified and we'll all be subject to the result.

Why can't we all just agree to pass the good parts of the treaty and throw out the bad parts? The best explanation for this I've heard was from this episode of NPR's Planet Money podcast. Basically, different parties want different things so they will trade off against each other - "I'll improve the situation for your steel industry in my country if you improve the situation for my textiles industry in yours". Essentially, it's a set of barters, but what's being bartered is laws instead of actual goods and services.

We don't normally barter for things in everyday life - we use money instead, buying and selling. So why would we barter in our trade treaties? If these barters are mutually beneficial, why could they not be structured as a set of exchanges for money: "You pay me X to improve the situation for your steel industry in my country" and "I'll pay you Y to improve the situation for my textiles industry in your country". This should be much quicker and easier to negotiate since only one thing needs to be negotiated at a time - the value to country A of changing the laws of country B.

I think that a big part of the reason it doesn't happen like this is that it would be an explicit giveaway of public money for the benefit of a particular private industry, and therefore a bit of a hard sell, politically speaking (which is too bad because such a series of giveaways would still leave us better off than TPP/TTIP). The cost could be offset by raising a tax on the industry in question, but if the law change that the industry is trying to obtain is the elimination of a tariff then it's a wash - instead of paying country B directly with a tariff, that industry would be spending the same money in taxes to country A, which then pays them to country B in the "law for money exchange".

Apart from eliminating tariffs, another major reason for these treaties seems to be harmonization of regulations. If you want to build a new product, it will have to pass certain tests before you're allowed to sell it. For a piece of consumer electronics, examples might be electrical safety tests and electromagnetic emissions tests. Now, because the laws requiring these standards are passed on a country-by-country basis, the regulations may differ from country to country. There is a thriving industry of test centers which will test your product to make sure it complies with applicable standards in all the countries where you want to sell it. In many cases, the tests don't even have to be repeated - the device just has to pass the test at the strictest setting and then all is good.

However, some unfortunate (though quite deliberate) provisions of the TPP and TTIP seek to "harmonize" these regulations by forcing them all to the lowest standard rather than the highest. Any country wishing to improve standards could be sued for it. There may be some regulations that are impossible to satisfy in two particular countries at once (meaning the companies would have to make different products for different markets, making them much more expensive than they would otherwise be) and I can see that such kinds of harmonization can be desirable for everyone (as long as they aren't weakened in the process). But again, these things can happen at the national level - no treaties need to be involved.

So essentially these trade treaties seem to be a way to take a set of massively regressive laws written by big corporations and package them up into something that can be sold to legislators as "you must pass this or our country will be left out of all this lucrative trade and the economic consequences will be disastrous".

A new protected class: things you've said

Saturday, October 27th, 2012

Some years ago, when this site was much smaller than it is today, it was suggested to me that I might want to be careful about what I wrote here lest it get read by a prospective employer who might find a reason in it to decline me employment. Having my website be nothing more than an online resume would be very boring, though, so I declined - in rather more polite terms than I really felt. Besides which, any employer who would do such a thing would clearly not be a good employer to work for. I'm lucky in that I have a pretty desirable skillset, though - not everyone is so fortunate.

I bring this up now because of this horrifying story that I read this morning. The very suggestion that such a thing might be done will have a massively chilling effect on participation in publicly archived discussions. Blogging is already hard enough knowing that everything I say is really part of my permanent record without imagining that it will be data-mined to discover all sorts of things about me that I didn't want to share in the first place!

We talk a lot about free speech in the western world, and take very seriously any possibility that government might limit that speech. But I think we don't take seriously enough threats to our free speech from the private sector. Knowing that we can't get arrested for stuff we say online isn't terribly useful if that same stuff can make us unemployable.

So, I'd like to see some kind of legal framework that would prevent employers from discriminating against prospective hires based on things they've said. Such a framework wouldn't be completely unprecedented - there are already several pieces of information that are technically available to employers which they can't use in employment decisions. I propose that we just expand that to make "stuff you've said" a protected class. Naturally, that would also make it illegal to fire someone over something that they said (though exceptions would probably have to be made for things directly related to their job - it should still be possible to fire someone for violating an NDA, for example).

Companies don't like to have employees who say terrible things on the internet, because it reflects badly on them (and their hiring practices). But it only does so because they have the power to do something about employees who say terrible things on the internet. If they didn't have that power, they could just say "it's not work related - it's nothing to do with us". Essentially, because it's not prohibited it's compulsory. So companies ought to be clamouring for this legislation - it would ensure they could concentrate on their core business and not have to go googling for dirt on their employees. It would also mean that they could choose the best person for the job without having to take into account stuff that fundamentally doesn't matter to them. And it would make it less likely that they would be left short-handed due to an ill-advised comment.

Practical considerations of a consumption tax

Friday, October 26th, 2012

I enjoy listening to NPR's Planet Money podcasts and I have learned a lot about economics from them.

One recent show that was particularly interesting and that has had me thinking is the one about the no brainer economic platform. That has made me reconsider several of my own economic opinions and throw several sacred cows on the barbecue (to mix up some metaphors).

In particular, I had always thought that income taxes were the best kind of taxes, being progressive and cheap to collect. However, as Planet Money points out - taxing something discourages that behavior and we don't want to discourage income and employment! A consumption tax makes much more sense in terms of incentives - discouraging consumption is environmentally friendly and, while it isn't naturally the most progressive form of taxation, can be tweaked to make it reasonably progressive. How to make it practical to collect is a different matter, though. Taxing something is always extremely invasive, as the government will require lots of information and documentation about that thing in order to make sure that you're not cheating on your taxes.

A carbon emission tax seems like one of the best kinds of consumption tax. We're pretty universally agreed that putting carbon dioxide in the atmosphere is an undesirable thing. Most of that net carbon comes from fossil fuels which tend to come from a few big mining operations, so it's natural and easy to tax these operations per tonne of carbon they sell. They will then pass these costs on to their customers and ultimately to the consumer, discouraging the consumption by means of increased fuel costs (of course, there are political problems with raising fuel prices but that's a bit out of the scope of this discussion - after all, Planet Money does point out that their economic platform would be political suicide!)

Fossil fuels aren't the only way carbon gets into the atmosphere, though. Should we also charge dairy farmers for the methane emissions from cows? That has a big effect on climate change too, although the carbon involved came out of the atmosphere much more recently than that from fossil fuels. The fairer we try to be, the more invasive the tax becomes. For example, consider someone who lives completely off the grid, growing trees and felling them for firewood. He's completely self-sufficient in terms of carbon cycles (while burning the trees puts out carbon dioxide, it's completely cancelled by growing them in the first place). So it doesn't seem like he should be paying any carbon tax. So we'd have to offset the emission tax against carbon absorption - in other words pay a credit to individuals and organizations causing a net reduction in the amount of carbon in the atmosphere. Growing trees is one such activity. What about landfill site operators, who sequester enormous amounts of carbon in the ground? That's not generally considered to be a hugely "green" activity, but by this measure it would be.

Of course, as we inevitably move away from fossil fuels, that form of tax becomes increasingly useless. We can tax other forms of "digging up stuff from the ground and using it up" consumption, but I suspect that getting all the tax money we need that way would have some really undesirable consequences (like pricing a lot of useful things like electronics right out of the range of what most people can afford).

What other things are consumed that we could tax? Well, there's a lot of energy falling on the earth in terms of sunlight that we can collect and reuse in various forms (and indeed I expect that's where much of our energy will come from in a few decades). But it's hard to think of that as "consumption" when it's so plentiful and so much of it just goes to waste. Harnessing solar energy is also something we don't really have any interest in discouraging.

There is one resource that we each have a finite amount of and have to be very careful how we use it. That is our time. Could we tax consumption of time? That sounds like a very regressive poll tax on the face of it. But what if we tax only wasted time? I subscribe to the view that time is only wasted if you're spending it doing something that you don't truly enjoy. There's an easy way to tell (from a taxation point of view) if you're doing something that you don't really enjoy - somebody has to pay you to do it. If you do it without being paid, then you're probably doing it because you want to and therefore not wasting your time.

So that brings us right back around to the income tax again. There's even a justification here for making the income tax progressive - if you need to be paid more to do some piece of work, obviously it's more unpleasant and therefore should be taxed at a higher rate (yes, I know that people don't really get paid in proportion to the unpleasantness of their jobs but it could be a useful legal fiction).

There's one aspect to income taxes that I don't really like, which is casual labour. Suppose I wanted to hire the kid next door to trim my hedges - if I had to fill in a big pile of tax paperwork in order to do so I probably wouldn't bother - I'd just do it myself instead. Lots of "under the table" work goes on and many blind eyes are turned to it which is a sad state of affairs - our laws ought to reflect actual practice rather than making huge swathes of our population (technically) outlaws. How do we draw the line between which jobs should be taxed and which shouldn't?

I haven't thought through all the ramifications of this, but I have an idea that perhaps income tax ought to be linked to limited liability. We give this great gift to corporations, limiting the liability of investors to only the amount of money they invest - if the corporation does something which costs society more than that, society absorbs the remainder of that cost. The idea is that by doing this we encourage entrepreneurship, which is all very well but it seems like there should be a cost to limiting liability, so perhaps income tax should only be raised when the employer is a limited liability organization. The deal ends up being the same as it currently is for such organizations, but if you don't need to limit your liability, then your employees don't need to pay income tax either - I think it's quite a neat way to delineate. On the small business end of things, there will be a population of companies with limited liability and a population without, they will compete with each other and market forces will determine the size of business at which limiting liability becomes worth it. Perhaps there could be some kind of sliding scale of liability (and therefore taxation) to ease the transition for companies growing across that boundary. I suspect I don't have the economic chops to have a good sense of how well that would work out in the real world though.

Multifunction gates

Friday, September 28th, 2012

Recently, I came across an interesting article about unusual electronic components. One of the components that article talks about is the multifunction gate, which is a 6-pin integrated circuit which can act as one of several different 2-input logic gates depending on which input pins are connected to which incoming signal lines, and whether the ones that aren't connected are pulled high or low.

However, the gates mentioned by that article can't act as any 2-input logic gate, only some of them. The 74LVC1G97 can't act as NAND or NOR, the 74LVC1G98 can't act as AND or OR, and neither of the devices can act as XOR or XNOR. Their truth tables are as follows:

Inputs   74LVC1G97   74LVC1G98
0 0 0        0           1
0 0 1        0           1
0 1 0        1           0
0 1 1        1           0
1 0 0        0           1
1 0 1        1           0
1 1 0        0           1
1 1 1        1           0

That got me wondering if it's possible for such a 6-pin device to act as any 2-input logic gate. Such a 6-pin device is naturally equivalent to a single 3-input logic gate, since two of the pins are needed for power. There are only 256 (2^8) possible 3-input logic gates (since the truth table for a 3-input logic gate contains 2^3 = 8 bits of information). So it's a simple matter to just write a program that enumerates all the possible 3-input logic gates, tries all the possible ways of connecting them up, and sees what the results are equivalent to.

There are 16 possible 2-input logic gates: 0, 1, A, B, ~A, ~B, A&B, A|B, ~(A&B), ~(A|B), A~B, ~(A~B), A&(~B), (~A)&B, A|(~B) and (~A)|B. Discounting the trivial gates, the gates that ignore one of their inputs and treating two gates as identical if we can get one from the other by swapping the inputs, gives us 8 distinct gates: AND, OR, NAND, NOR, XOR, XNOR, AND-with-one-inverting-input and OR-with-one-inverting-input.

There are 4 possibilities for connecting up each of the three input pins: low, high, incoming signal line A and incoming signal line B. I originally thought that connecting the output pin to one of the input pins might also be useful, but it turns out that it isn't - either the value of the input pin makes no difference (in which case it might just as well be connected to one of the other four possibilities) or it does. If it does, then either the output is the same as the input pin that it's connected to (in which case it's underconstrained, and not a function of the other two input pins), or the opposite (in which case it's overconstrained, and also not a function of the other two input pins).

So we have 256 possible 3-input logic gates, times four possibilities for each of the three input pins (4^3 = 64 wirings), times two possible states for each of the two incoming signal lines (2^2 = 4 input combinations) - that's 65536 circuit evaluations to try, which a computer program can run through in a timespan that is indistinguishable from instantaneous by unaided human senses.

Running the program didn't find any 3-input gates which can be configured to act as any of the 16 2-input gates, but it did find something quite interesting - a dozen 3-input gates which can be configured to make 14 of them. Since swapping the inputs around gives equivalent gates, there are actually just two different gates, which for the purposes of this essay I'll call Harut and Marut.

Inputs   Harut   Marut
0 0 0      0       1
0 0 1      0       1
0 1 0      1       0
0 1 1      1       0
1 0 0      1       0
1 0 1      0       1
1 1 0      0       1
1 1 1      1       0

Note that the truth tables for Harut and Marut are quite similar to 74LVC1G97 and 74LVC1G98 respectively, just with two of the rows swapped or inverted. I haven't been able to find Harut and Marut gates on Mouser - it would be interesting to know if anybody makes them. One possible downside is that the 3-input gate isn't as useful in its own right (the 3-input gates for 74LVC1G97 and 74LVC1G98 are just a 2-input multiplexer and same with inverted output respectively).
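The search described above can be sketched in a few lines of Python. This is a reconstruction rather than the original program: the bit encoding of the truth tables (first table row in the least significant bit) and all of the names are my own assumptions.

```python
from itertools import product

SOURCES = ("0", "1", "A", "B")  # what each input pin may be tied to

def pin_value(source, a, b):
    """Logic level seen by a pin tied to `source` when the signals are a and b."""
    return {"0": 0, "1": 1, "A": a, "B": b}[source]

def reachable(gate):
    """Return the set of 2-input truth tables that `gate` can be wired into.

    `gate` is an 8-bit int (assumed encoding): bit (x<<2 | y<<1 | z) is the
    output for inputs x, y, z. Each result is a 4-bit int where bit
    (a<<1 | b) is the output for signal values a, b.
    """
    found = set()
    for wiring in product(SOURCES, repeat=3):      # 4^3 = 64 wirings
        table = 0
        for a, b in product((0, 1), repeat=2):     # 2^2 = 4 input states
            x, y, z = (pin_value(s, a, b) for s in wiring)
            out = (gate >> (x << 2 | y << 1 | z)) & 1
            table |= out << (a << 1 | b)
        found.add(table)
    return found

# Truth tables from the text, first row (0 0 0) in the least significant bit.
MUX_97 = 0b10101100   # 74LVC1G97
HARUT = 0b10011100
MARUT = 0b01100011

NAMES = {0b1000: "AND", 0b1110: "OR", 0b0111: "NAND",
         0b0001: "NOR", 0b0110: "XOR", 0b1001: "XNOR"}

def missing(gate):
    """Names (or raw bit patterns) of the 2-input gates `gate` cannot form."""
    absent = set(range(16)) - reachable(gate)
    return sorted(NAMES.get(t, format(t, "04b")) for t in absent)

print("74LVC1G97 cannot form:", missing(MUX_97))
print("Harut cannot form:", missing(HARUT))
print("Marut cannot form:", missing(MARUT))

best = max(len(reachable(g)) for g in range(256))
print("most 2-input gates from one device:", best)
print("devices achieving it:", [g for g in range(256) if len(reachable(g)) == best])
```

With this encoding, Harut comes out missing only NAND and NOR, and Marut only AND and OR, so a Harut and a Marut between them cover all 16 gates.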

I think it should be possible to design a 6-pin device that can yield any 2-input gate by making the supply lines take part in the configuration as well. For example, if you have a Harut gate and a Marut gate which give high-impedance outputs when power is not applied, you could put diodes on their supply lines and connect them up in parallel with the supply lines interchanged. Then applying power in the normal way would yield a Harut gate and reversing the polarity would yield a Marut gate. There are probably better ways to do it.

Buying up the junk

Wednesday, September 26th, 2012

I recently moved from Seattle to the UK. Part of the process of doing a big move like that is getting rid of stuff that you don't need any more. For some things that's just a question of throwing it away (we had one visit from the haulers, two trips to the tip and several weeks of extra garbage charges). But there's other stuff that, while we didn't need it any more, would have some value to someone else. Getting rid of all that stuff was a consistent feature of all our todo lists, and we still didn't manage to get rid of as much stuff as we wanted. We gave some things away to friends, sold some things on a neighborhood email list and had a (not particularly successful) garage sale.

There really ought to be a better way than this. There must be an enormous amount of value locked up in people's houses in the form of stuff that they don't use but which still has some value and which they therefore don't want to throw away, but getting rid of it is an annoying, difficult, low priority task, so it never gets done.

I think somebody could make a fortune by setting up a company that takes away your unwanted stuff. You'd request a visit from them on their website (or maybe they could just show up on a regular basis) and they'd take away anything that you didn't want. They'd do the work of valuing it, selling it and shipping it, and then send you the proceeds (after taking their cut). If, after the valuation stage, you decided that the item was worth more than that you could reject the offer and they'd bring it back with the next visit (perhaps for a small charge to avoid the service being abused as a free valuation service). Items that might leak liquids or emit odors would probably not be accepted (the small amount of value held in such items would probably not be worth the possible damage to other items).

They'd do all the work of making sure that items were packed sufficiently well for shipping, reusing packing materials as much as possible (and eliminating a large amount of waste). If they delivered items as well (instead of relying on UPS, Fedex or similar) they could take away the packing materials on delivery, helping the environment and saving the customer from another annoying job (my workshop in Seattle often used to get cluttered up with old cardboard boxes, packing peanuts and bubble wrap).

Another nice thing about this business is that it would be really easy to bootstrap - you could start it off in just one city with a couple of people, a van and a simple website and some insurance against breaking things. Deliveries to places too far away for the van to get to (or between two different cities where the company does have vans) could be done with the existing delivery services. After visiting one house for a requested pickup they could visit other nearby houses and ask if they have any items they want to get rid of.

eBay private bids

Tuesday, September 25th, 2012

There are all sorts of programs available for "eBay sniping" - automatically placing your bid in the last moments of the auction in order to minimize the amount of information available to your adversaries (the other bidders) and thereby maximize your chances of winning while minimizing the amount you expect to pay.

The trouble with this is that it creates two classes of eBay bidders - those who have paid extra (in money, effort or both) for the sniping software, and those who haven't. This makes the eBay user experience less friendly and more frustrating.

So I think eBay should offer its own (free) sniping software built right into the site - give bidders the opportunity to make a public or a private bid (or both). Only the highest public bid is shown but the highest bid overall (public or private) will actually win.

Why would anyone make a public bid if a private one is available? Wouldn't this turn all eBay auctions into silent (?) auctions? Not necessarily - there are some circumstances when a public bid is actually in the bidder's favour - for example if there are several auctions for equivalent items all ending around the same time, making a public bid on one of them is likely to push other bidders towards the other auctions.

Though that bit of game-theoretic oddness could also be eliminated with a feature closely related to private bids, which is the ability to (automatically) withdraw a private bid. This would allow one to bid on several auctions at once, while guaranteeing that you'll only win one of them. More complicated logic would also be possible, like "I want to buy either A or the combination of (B and any of C, D or E)". I'm not sure if this is currently possible with sniping software (I haven't used it). One could also set different bids for different auctions, if some are more desirable than others in some ways.

All these changes favor buyers rather than sellers, so eBay users who are primarily sellers probably wouldn't like them (after all, they help buyers save money!) But sellers already hate eBay - many of its other policies are drastically biased towards buyers. The only reason that sellers keep selling stuff on eBay is that that is where the buyers are (and therefore that is where the best prices are, even after factoring out eBay's cut).

One other reason that eBay might want to do this would be that by having private bids go through the site, they get more accurate information about who is prepared to pay how much for what. I don't know if eBay currently does anything with this sort of information, but it surely must have some value to someone.

Fractional exponent floating point numbers

Thursday, September 20th, 2012

The usual binary representation of floating point numbers is a sign/exponent/mantissa representation. To get an idea of what this means, consider a hypothetical 6-bit floating point format with 1 sign bit, 3 bits of exponent and 2 bits of mantissa. That gives us 64 possibilities, which is sufficiently few that it's practical to list all the possible values:

0 000 00 +0   1 000 00 -0
0 000 01 +0.0625   1 000 01 -0.0625
0 000 10 +0.125   1 000 10 -0.125
0 000 11 +0.1875   1 000 11 -0.1875
0 001 00 +0.25   1 001 00 -0.25
0 001 01 +0.3125   1 001 01 -0.3125
0 001 10 +0.375   1 001 10 -0.375
0 001 11 +0.4375   1 001 11 -0.4375
0 010 00 +0.5   1 010 00 -0.5
0 010 01 +0.625   1 010 01 -0.625
0 010 10 +0.75   1 010 10 -0.75
0 010 11 +0.875   1 010 11 -0.875
0 011 00 +1   1 011 00 -1
0 011 01 +1.25   1 011 01 -1.25
0 011 10 +1.5   1 011 10 -1.5
0 011 11 +1.75   1 011 11 -1.75
0 100 00 +2   1 100 00 -2
0 100 01 +2.5   1 100 01 -2.5
0 100 10 +3   1 100 10 -3
0 100 11 +3.5   1 100 11 -3.5
0 101 00 +4   1 101 00 -4
0 101 01 +5   1 101 01 -5
0 101 10 +6   1 101 10 -6
0 101 11 +7   1 101 11 -7
0 110 00 +8   1 110 00 -8
0 110 01 +10   1 110 01 -10
0 110 10 +12   1 110 10 -12
0 110 11 +14   1 110 11 -14
0 111 00 +Infinity   1 111 00 -Infinity
0 111 01 SNaN   1 111 01 SNaN
0 111 10 QNaN   1 111 10 QNaN
0 111 11 QNaN   1 111 11 QNaN

So we can represent numbers up to (but not including) 16, in steps of (at finest) 1/16.

The distribution of these numbers (on the positive side) looks like this:

As you can see it approximates an exponential curve but is piecewise linear. Despite there being 64 distinct bit patterns, only 55 real numbers are represented because of the infinities, Not-a-Numbers and the two zeroes. Near zero, the pattern is broken and the numbers between 0 and 0.25 have the same spacing as the numbers between 0.25 and 0.5 - these are called the "denormals" and cause enormous headaches for CPU manufacturers, OS developers, toolchain developers and developers of numerical software, since they often either don't work according to the standards or are much slower than the rest of the floating point numbers.

With IEEE-754 single precision floating point numbers, there are so many pieces to the piecewise linear curve (254 of them for each sign) that the piecewise linearity wouldn't be visible on a graph like this, and the proportion of the bit patterns corresponding to distinct real numbers is much higher (close to 255/256 rather than 7/8).

How different would the world be if we just used an exponential curve instead of a piecewise linear approximation of one? That is to say, what if instead of representing floating point number x as A*2^B for some integers A and B, we represented it as 2^(A/k) for some integer A and a constant integer k? In other words, we represent floating point numbers as the exponent of a fixed point fraction (hence, the name Fractional Exponent Floating Point or FEFP). Let's choose k such that the powers of two are the same as in the exponent/mantissa representation, i.e. k=4 in this case, and again list all the possibilities:

0 000 00 +0   1 000 00 -0
0 000 01 +0.1487   1 000 01 -0.1487
0 000 10 +0.1768   1 000 10 -0.1768
0 000 11 +0.2102   1 000 11 -0.2102
0 001 00 +0.25   1 001 00 -0.25
0 001 01 +0.2973   1 001 01 -0.2973
0 001 10 +0.3536   1 001 10 -0.3536
0 001 11 +0.4204   1 001 11 -0.4204
0 010 00 +0.5   1 010 00 -0.5
0 010 01 +0.5946   1 010 01 -0.5946
0 010 10 +0.7071   1 010 10 -0.7071
0 010 11 +0.8409   1 010 11 -0.8409
0 011 00 +1   1 011 00 -1
0 011 01 +1.1892   1 011 01 -1.1892
0 011 10 +1.4142   1 011 10 -1.4142
0 011 11 +1.6818   1 011 11 -1.6818
0 100 00 +2   1 100 00 -2
0 100 01 +2.3784   1 100 01 -2.3784
0 100 10 +2.8284   1 100 10 -2.8284
0 100 11 +3.3636   1 100 11 -3.3636
0 101 00 +4   1 101 00 -4
0 101 01 +4.7568   1 101 01 -4.7568
0 101 10 +5.6569   1 101 10 -5.6569
0 101 11 +6.7272   1 101 11 -6.7272
0 110 00 +8   1 110 00 -8
0 110 01 +9.5137   1 110 01 -9.5137
0 110 10 +11.3137   1 110 10 -11.3137
0 110 11 +13.4543   1 110 11 -13.4543
0 111 00 +16   1 111 00 -16
0 111 01 +SNaN   1 111 01 -SNaN
0 111 10 +QNaN   1 111 10 -QNaN
0 111 11 +Infinity   1 111 11 -Infinity

I moved the infinities to have "all 1s" representation, and just reserved 4 extra entries for NaNs (I'm not sure if anyone actually uses the facility of NaNs to store extra data, but a nice thing about this representation is that you can dedicate as few or many bit patterns to NaNs as you like).

A couple of improvements: our "minimum interval" between adjacent numbers is now about 1/36 instead of 1/16 and (because of the NaN tweak) we can represent slightly higher numbers.
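As a check on those figures, here is a sketch of a decoder for the 6-bit FEFP table. The offset of 12 is inferred from the table (0 011 00 decodes to 1), the placement of NaNs and infinities follows the table, and the names are hypothetical.

```python
import math

K = 4            # the constant k from the text: steps per power of two
ZERO_POINT = 12  # inferred: the 5-bit field 01100 represents 2**0

def decode_fefp(bits):
    """Decode a 6-bit FEFP pattern: 1 sign bit plus a 5-bit fixed point exponent."""
    sign = -1.0 if bits & 0b100000 else 1.0
    field = bits & 0b11111
    if field == 0:
        return sign * 0.0              # zero is the one special case at the bottom
    if field == 0b11111:
        return sign * math.inf         # infinities moved to all 1s
    if field in (0b11101, 0b11110):
        return math.nan                # the two NaN patterns per sign
    return sign * 2.0 ** ((field - ZERO_POINT) / K)

positives = sorted(decode_fefp(f) for f in range(1, 0b11101))
min_gap = min(b - a for a, b in zip(positives, positives[1:]))

print([round(v, 4) for v in positives])
print("largest finite value:", positives[-1])
print("minimum interval is about 1/%.1f" % (1 / min_gap))
```

The smallest gap is the one between the two smallest positive values, since the spacing grows in proportion to the values themselves.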

In this representation, there's only one "denormal" (zero) for each sign. That can't fit the pattern however we cut it, since our "A" value never reaches minus infinity, however I suspect people would complain if they couldn't put zero in their floating point variables. This does mean that the interval between zero and the next highest number is larger than the minimum interval.

Would it be practical for CPUs to actually do computations with these numbers? Surprisingly, yes - though the trade-offs are different. Multiplication and division of FEFP numbers is just addition and subtraction (of the fixed point fractional exponent). Addition and subtraction are more complicated - you need to actually take the exponent of the representations, add or subtract them and then compute the logarithm to get the representation of the result. That sounds really hard, but using the CORDIC algorithms and some of the trickery currently used to add and subtract floating point numbers, I think additions and subtractions could be done in about the same time and using roughly the same number of transistors as multiplications and divisions of floating point numbers currently use. What's more, it might be possible to reuse some of the add/subtract logic to actually compute exponents and logarithms (which are themselves quite useful and often implemented in hardware anyway).
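To make the trade-off concrete, here is a rough software sketch of the two operations on positive FEFP values, working directly on the integer exponent field. It uses ordinary floating point `2.0 ** x` and `math.log2` as a stand-in for the CORDIC-style hardware suggested above, and it ignores signs, zero, NaNs and overflow; the function names and the offset of 12 are assumptions carried over from the 6-bit table.

```python
import math

K = 4            # steps per power of two, as in the 6-bit example
ZERO_POINT = 12  # assumed offset: field value 12 represents 2**0

def fefp_value(field):
    """Real number represented by a positive, finite FEFP field."""
    return 2.0 ** ((field - ZERO_POINT) / K)

def fefp_mul(x, y):
    """Multiply by adding exponents: 2^a * 2^b = 2^(a+b). Always exact."""
    return x + y - ZERO_POINT

def fefp_div(x, y):
    """Divide by subtracting exponents."""
    return x - y + ZERO_POINT

def fefp_add(x, y):
    """Add by exponentiating, summing, and taking the logarithm again.

    The result is rounded to the nearest representable exponent, so unlike
    multiplication it is generally inexact.
    """
    total = fefp_value(x) + fefp_value(y)
    return round(K * math.log2(total)) + ZERO_POINT

two, four = 16, 20  # fields for 2 and 4 in the table
print(fefp_value(fefp_mul(two, four)))   # product of 2 and 4
print(fefp_value(fefp_add(two, two)))    # sum of 2 and 2
print(fefp_value(fefp_add(12, two)))     # 1 + 2, rounded to a representable value
```

In this sketch multiplication and division never round, while addition rounds to the nearest step on the exponential curve, which is the reverse of the situation with conventional floating point.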

So I actually think that FEFP is superior to IEEE-754 in several ways. The latter is sufficiently entrenched that I doubt that it could ever be replaced completely, but FEFP might find some use in special-purpose hardware applications where IEEE-754 compatibility doesn't matter, such as signal processing or graphics acceleration.

Spam for buying instead of selling

Wednesday, September 19th, 2012

Most unsolicited commercial email tries to get its readers to buy something, but recently I had some spam that was the other way around - they wanted to give me money! Specifically, they wanted to buy ad space on one of my sites. Now, given that I don't have any costs involved in running the site in question (the hosting and domain name were donated), I don't feel I have the right to put ads on the site. I wouldn't want to anyway - it's a better site for not having ads, and it's not like the amount of money involved is likely to make the slightest bit of difference to the quality of my life. For the same reasons I don't have ads on this site (maybe if I could make enough money from the ads to give up my day job it would be a different story but none of my websites has anywhere near that amount of traffic!)

Even so, it took me a moment to decide whether to reply with a polite "no thank you" or to press the "report as spam" button. In the end I decided on the latter course of action - it's still unsolicited commercial email even if it's buying instead of selling. And one of the last things the internet needs is more advertising. It's kind of interesting though that advertisers (or ad brokers at least) are starting to court webmasters - it always seemed to me that the supply of advertising space on the internet vastly outstripped the demand, but maybe that's changing.

Similarly, my web host recently sent me a coupon for $100 of free Google ads - I have no use for them myself (I don't think anyone interested in what I have to say here is going to find it via an ad) but I hope I'll be able to donate them to someone who does have a use for them.

Removing outliers

Monday, October 17th, 2011

If you've got a sequence of data that you want to do some statistical analysis on, but you know that some of it is bad, how do you remove the bad data? You could just remove the top 5% and bottom 5% of values, for example, but maybe you don't have any bad data (or maybe you have lots) and that would adversely affect your measurement of the standard deviation.

Suppose that you know that your dataset is supposed to follow a normal distribution (lots do). Then you could remove outliers by measuring the skew and kurtosis (3rd and 4th standardized moments), and just repeatedly remove the sample furthest from the mean until these measurements look correct (for a normal distribution, a skew of 0 and an excess kurtosis of 0). This algorithm is guaranteed to terminate, since each step removes one sample and once you are down to 2 samples the skew is 0 and the kurtosis is at its minimum possible value, so there are no heavy tails left to remove. You've still got a parameter or two to tune though (how much skew and kurtosis you'll tolerate).
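Here is a rough sketch of that loop, not a tuned implementation. The function names and the two tolerance parameters (with their default values) are made up for illustration, and it only treats excess kurtosis that is too high as a problem, on the assumption that outliers show up as heavy tails.

```python
def skew_and_excess_kurtosis(samples):
    """Return (mean, skew, excess kurtosis) using population moments."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    if m2 == 0:
        # All samples identical: nothing sticks out, so report "normal".
        return mean, 0.0, 0.0
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return mean, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def remove_outliers(samples, max_skew=0.5, max_excess_kurtosis=0.5):
    """Drop the sample furthest from the mean until the moments look normal."""
    data = list(samples)
    while len(data) > 2:
        mean, skew, kurtosis = skew_and_excess_kurtosis(data)
        if abs(skew) <= max_skew and kurtosis <= max_excess_kurtosis:
            break
        # Remove the single sample that is furthest from the current mean.
        data.remove(max(data, key=lambda x: abs(x - mean)))
    return data

readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 1000.0]
cleaned = remove_outliers(readings)
```

The mean and moments are recomputed after every removal, since a single bad sample can drag the mean a long way towards itself.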

Import taxes to level the playing field

Thursday, October 6th, 2011

There's a big hurdle for the US manufacturing sector in that many products are much cheaper to produce in China or Taiwan. Part of the reason for this is that those countries have much less strict standards on things like pay, employee working conditions and pollution controls. Another way of looking at it is that US companies are saving money and skirting important social and environmental rules by outsourcing their manufacturing to countries which don't have these rules - with excellent consequences in terms of increased profits and cheaper products, but disastrous consequences in terms of the environment, the welfare of those who make the products, and US manufacturing jobs.

It seems to me that the US government should use its power to tax imports to level the playing field for US manufacturers by removing the incentive to manufacture elsewhere. In other words, the import tax on something should be the difference between what it would cost to make in the US and what it actually cost to make under laxer regulations. Then, manufacturing for things sold in the US would (over time) move to the optimal locations based on where they were being sold and where the raw materials were mined or recycled.

Suddenly making all our electronics more expensive by (maybe) a factor of 10 would be enormously disruptive so I suggest ramping up the tax gradually over a period of (say) 10 years or so. That would lessen the blow and give the US manufacturing companies some time to bootstrap. It also gives a great incentive to China to improve working conditions and emissions since doing so essentially wouldn't cost them anything (it would be covered by the corresponding reductions in import taxes).

In the long term, I would expect that the final cost of the manufactured products in question would stay about the same or even become cheaper than what they would be without these taxes. That's because as it becomes more expensive to hire people to do a menial job, it becomes more cost-effective to automate that job. The machine costs more to begin with (you need clever people to build and program it) but once it's up and running the unit cost per produced item is much lower.

There's a lot more to it than that, of course - China still has a big advantage in the expertise it has developed in building things, China sells to other places besides the US, and there are currency, debt, and trade treaty issues which further complicate matters in ways I don't completely understand. Still, I think it's an interesting idea to consider.