FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   PreferencesPreferences   Log in to check your private messagesLog in to check your private messages   Log inLog in 
Forum index » Science and Technology » Math
Mathematical formula to compare integer sequence
Post new topic   Reply to topic Page 1 of 1 [14 Posts] View previous topic :: View next topic
Author Message
Abstract Dissonance
science forum Guru Wannabe


Joined: 29 Dec 2005
Posts: 201

PostPosted: Tue Jun 27, 2006 8:09 pm    Post subject: Mathematical formula to compare integer sequence Reply with quote

Is there a way to compare two integer sequences that are permutations of
each other?

I'm looking for a function on a subset of N that is "permutation invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Ofcourse there are many "algorithmic" ways to do this such as sorting then
comparing element wise but I'm wondering if there is a better way.

I was thinking maybe one could have a polynomial such as

sum(Product(|a_k - b_j|,j=1..n),k=1..n)

which will give zero if a and b are permutations of each other(by which I
mean a_k = b_i_k for some permutation i_k of the index k).

I'm not completely convinced this gives what I want though, although I think
it might(as I it just came off the top of my head as I was writing this).

I think this will work and will do some more investigating or maybe try and
prove it(I it should be easy if it works)... but maybe there are other ways
too?

Jon
Back to top
Abstract Dissonance
science forum Guru Wannabe


Joined: 29 Dec 2005
Posts: 201

PostPosted: Tue Jun 27, 2006 8:40 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

I guess I should mention that the sequences I am talking about will not
contain 0.
Back to top
Tim Peters
science forum Guru


Joined: 30 Apr 2005
Posts: 426

PostPosted: Tue Jun 27, 2006 8:47 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

[Abstract Dissonance]
Quote:
Is there a way to compare two integer sequences that are permutations of
each other?

I think you're asking about ways to tell _whether_ one is a permutation of
the other.

Quote:
I'm looking for a function on a subset of N that is "permutation
invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Of course there are many "algorithmic" ways to do this such as sorting
then comparing element wise but I'm wondering if there is a better way.

Algorithmically, it would be better to convert them to multisets based on
hash tables. Then the expected time is linear in the number of elements,
which is optimal.

Quote:
I was thinking maybe one could have a polynomial such as

sum(Product(|a_k - b_j|,j=1..n),k=1..n)

which will give zero if a and b are permutations of each other(by which I
mean a_k = b_i_k for some permutation i_k of the index k).

Except it also returns 0 for:

a = [1,1,1,2], b = [1,2,3,4]

What it's really telling you is whether set(a) is a subset of set(b) This
part:

Product(|a_k - b_j|,j=1..n)

returns 0 iff a_k appears somewhere in b, so the sum returns 0 iff each
element of a appears somewhere in b, which is a subset (ahd without regard
to multiplicity) test.

If you could make it work, it would take time quadratic in the number of
elements. You didn't define what "better" means to you, but in most of my
worlds that's not good ;-)

Quote:
I'm not completely convinced this gives what I want though, although I
think it might(as I it just came off the top of my head as I was
writing this). I think this will work and will do some more investigating
or maybe try and prove it(I it should be easy if it works)... but maybe
there are other ways too?

Do define "better" first. You can, e.g., get simple but absurdly
impractical solutions via stuff like:

f(a) = product(p_a_i, i=1..n)

where p_a_i denotes the a_i'th prime. For example,

f([1,1,1,2]) = 2*2*2*3 = 24
and
f([2,1,1,1]) = 3*2*2*2 = 24
and
f([1,2,1,2]) = 2*3*2*3 = 36
and
f([1,2,3,4]) = 2*3*5*7 = 210

etc. It's easy then to prove that f(a) = f(b) if and only if a is a
permutation of b. In effect, this _is_ building a hash-based multiset
representation, but in a spectacularly impractical way.
Back to top
Abstract Dissonance
science forum Guru Wannabe


Joined: 29 Dec 2005
Posts: 201

PostPosted: Tue Jun 27, 2006 9:08 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

"Tim Peters" <tim.one@comcast.net> wrote in message
news:8vKdnQjZ7cv1BDzZnZ2dnUVZ_oKdnZ2d@comcast.com...
Quote:
[Abstract Dissonance]
Is there a way to compare two integer sequences that are permutations of
each other?

I think you're asking about ways to tell _whether_ one is a permutation of
the other.

I'm looking for a function on a subset of N that is "permutation
invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Of course there are many "algorithmic" ways to do this such as sorting
then comparing element wise but I'm wondering if there is a better way.

Algorithmically, it would be better to convert them to multisets based on
hash tables. Then the expected time is linear in the number of elements,
which is optimal.

I was thinking maybe one could have a polynomial such as

sum(Product(|a_k - b_j|,j=1..n),k=1..n)

which will give zero if a and b are permutations of each other(by which I
mean a_k = b_i_k for some permutation i_k of the index k).

Except it also returns 0 for:

a = [1,1,1,2], b = [1,2,3,4]

What it's really telling you is whether set(a) is a subset of set(b) This
part:

Product(|a_k - b_j|,j=1..n)

returns 0 iff a_k appears somewhere in b, so the sum returns 0 iff each
element of a appears somewhere in b, which is a subset (ahd without regard
to multiplicity) test.


yeah ;/

Quote:
If you could make it work, it would take time quadratic in the number of
elements. You didn't define what "better" means to you, but in most of my
worlds that's not good ;-)


What I mean is something that is simple to implement. Doesn't necessarily
have to be fast as I'm only using it sequences of small size. I was going to
just sort them then compare element wise as thats pretty easy to implement
but I was curious as to if there was a mathematical function that would do
the same thing. Similar to what I shown above but that doesn't work ;/ It
might be able to be modified somewhat though

Product(|#(a_k)*a_k - #(b_k)*b_j|,j=1..n)

where #(a_k) returns the number of duplicates of the element a_k in the
sequence. I think that might fix the problem and is sorta like using the
multiset idea?


Quote:
I'm not completely convinced this gives what I want though, although I
think it might(as I it just came off the top of my head as I was
writing this). I think this will work and will do some more
investigating
or maybe try and prove it(I it should be easy if it works)... but maybe
there are other ways too?

Do define "better" first. You can, e.g., get simple but absurdly
impractical solutions via stuff like:

f(a) = product(p_a_i, i=1..n)

where p_a_i denotes the a_i'th prime. For example,

f([1,1,1,2]) = 2*2*2*3 = 24
and
f([2,1,1,1]) = 3*2*2*2 = 24
and
f([1,2,1,2]) = 2*3*2*3 = 36
and
f([1,2,3,4]) = 2*3*5*7 = 210

etc. It's easy then to prove that f(a) = f(b) if and only if a is a
permutation of b. In effect, this _is_ building a hash-based multiset
representation, but in a spectacularly impractical way.

lol... its not that impratical! Wink I could use a look up table for the
primes and it would be quite fast. Ofcourse I don't really want to have to
read in an array of primes to do a simple computation. Its not that it can't
be done in a simple way using algorithms in C++ but just that I was
wondering if there was a simple mathematical way. You have given a pretty
simple one in some sense. It is based on primes though which makes it a tad
more complicated. Maybe there is another way that doesn't have some hidden
complexities involved(i.e., finding primes) but is based strictly on
algebra?

similar to(if it works).

sum(Product(|#(a_k)*a_k - #(b_k)*b_j|,j=1..n),k=1..n)

but even the # function is sorta "algorithmic".... (ofcourse one can say
that sum and product are too but there seems to be some difference between
the two... one involves "searching and comparing" while the other is purely
computational).

I'm not to worried about the time complexity though as I'm more interesting
in its "algorithmic" simplicity(basically just algebra).

Thanks,
Jon
Back to top
guenther.vonKnakspott@gmx
science forum Guru Wannabe


Joined: 24 Apr 2005
Posts: 250

PostPosted: Tue Jun 27, 2006 9:18 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

Abstract Dissonance wrote:
Quote:
Is there a way to compare two integer sequences that are permutations of
each other?

I'm looking for a function on a subset of N that is "permutation invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Ofcourse there are many "algorithmic" ways to do this such as sorting then
comparing element wise but I'm wondering if there is a better way.

I was thinking maybe one could have a polynomial such as

sum(Product(|a_k - b_j|,j=1..n),k=1..n)

which will give zero if a and b are permutations of each other(by which I
mean a_k = b_i_k for some permutation i_k of the index k).

I'm not completely convinced this gives what I want though, although I think
it might(as I it just came off the top of my head as I was writing this).

I think this will work and will do some more investigating or maybe try and
prove it(I it should be easy if it works)... but maybe there are other ways
too?

Jon
if A is a subset of IN and A={a0,a1,...,aq}, and f a function from IN^q

-->IN with F(A)=2^a0+2^a1+...+2^aq then F(A)=F(B) <=>A=B.
Regards.
Back to top
Tim Peters
science forum Guru


Joined: 30 Apr 2005
Posts: 426

PostPosted: Tue Jun 27, 2006 10:17 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

[Abstract Dissonance]
Quote:
Is there a way to compare two integer sequences that are permutations of
each other?

I'm looking for a function on a subset of N that is "permutation
?? invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

...

[guenther vonKnakspot]
Quote:
if A is a subset of IN and A={a0,a1,...,aq}, and f a function from IN^q
-->IN with F(A)=2^a0+2^a1+...+2^aq then F(A)=F(B) <=>A=B.

While he said "subset of N", his examples make clear he's really interested
in elements of N^i (for some i). That is, duplicates are allowed. The
function you suggest can fail then, e.g. F([1, 1, 4]) = F([2, 3, 3]).

If he can put an /a priori/ bound M on the maximum number of times an
element can appear, and has "big enough" integers, then the similar

G(a) = sum(2^(k*a_i))

works, where k = ceiling(log(M+1)/log(2)). That is, the integer is viewed
as a catenation of k-bit counts, where k bits is the minimum needed to hold
M.
Back to top
Dave Rusin
science forum Guru


Joined: 25 Mar 2005
Posts: 487

PostPosted: Wed Jun 28, 2006 12:49 am    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

In article <12a343b588impc7@corp.supernews.com>,
Abstract Dissonance <Abstract.Dissonance@hotmail.com> wrote:

Quote:
I'm looking for a function on a subset of N that is "permutation invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Two sequences of numbers can only be equivalent under a permutation if
they have the same cardinality n .

Two sequences of numbers of the same cardinality n are permutations
of each other iff the first n elementary symmetric functions of the
numbers are equal. That is, they must have the same sum, the same
sum-of-products-taken-two-at-a-time, ..., and the same product.
It is equivalent (in characteristic zero) to say that they have
the same sums of powers, that is, the two sums must be equal, the
two sums of the squares must be equal, ... and the sums of the n-th
powers must be equal.

One more equivalence: the sequences are permutations of each other
iff the polynomials \prod ( X - a_i ) and \prod( X - b_i ) are
equal. You can compute these as the terms of the sequence go
streaming by, that is, you don't even have to have possession of
the whole sequence at any one time (nor indeed even know its length
when you start).

As a practical matter, these techniques can be difficult to use for
sequences of real numbers; for example the polynomial
(X-1)(X-2)...(X-20)
has integer coefficients but they are as large as 20! and so you may
not be able to store them exactly with a naive algorithm. Rounding each
to a 10-digit floating-point number gives another polynomial that
doesn't even have 20 real roots! (Maple finds real roots only near
1, 2, 3, 4, 5, and 6.8 ). Since you specified INTEGER sequences you
will have no roundoff error, but you will have to do something to
handle potentially large integers. It's probably easiest to compute
several values of the form \sum_i (a_i)^j mod p_k for
j = 1, 2, ..., n where the p_k range over a sufficiently large
set of smallish primes.

What makes for an optimal implementation will depend on things like
the sizes of n and the a_i, as well as a sense of the likelihood
that the sequences really will be permutations of each other, and
a sense of the ways in which they will "resemble" each other when
they are actually not simply rearrangements of each other.

You can if you wish combine the multiple things I have suggested you
compute into a single integer (that is, I am recommending a function
f which maps sequences to _sequences_, whereas you seem to want
f([a1, ..., a_n]) to be a _number_). Clearly such an approach is of
limited utility: if you want f(a) to be (1) an integer, (2) different
when a and b are not permutations of each other, and (3) small
enough to be worked easily, then it will only be possible to use f
on a small set of possible input sequences (e.g. 2^8 of them if f(a)
is to be a single byte).

dave
Back to top
guenther.vonKnakspott@gmx
science forum Guru Wannabe


Joined: 24 Apr 2005
Posts: 250

PostPosted: Wed Jun 28, 2006 4:44 am    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

Tim Peters wrote:
Quote:
[Abstract Dissonance]
Is there a way to compare two integer sequences that are permutations of
each other?

I'm looking for a function on a subset of N that is "permutation
?? invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

...

[guenther vonKnakspot]
if A is a subset of IN and A={a0,a1,...,aq}, and f a function from IN^q
-->IN with F(A)=2^a0+2^a1+...+2^aq then F(A)=F(B) <=>A=B.

While he said "subset of N", his examples make clear he's really interested
in elements of N^i (for some i). That is, duplicates are allowed. The
function you suggest can fail then, e.g. F([1, 1, 4]) = F([2, 3, 3]).

If he can put an /a priori/ bound M on the maximum number of times an
element can appear, and has "big enough" integers, then the similar

G(a) = sum(2^(k*a_i))

works, where k = ceiling(log(M+1)/log(2)). That is, the integer is viewed
as a catenation of k-bit counts, where k bits is the minimum needed to hold
M.
You are right of course, he doesen't really mean subsets. I was too

sloppy and lazy to distinguish between A first as a subset and later A
as an q-tuple and therefore didn't bother to take a closer look at his
examples.
Regards.
Back to top
Abstract Dissonance
science forum Guru Wannabe


Joined: 29 Dec 2005
Posts: 201

PostPosted: Wed Jun 28, 2006 5:36 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

"Dave Rusin" <rusin@vesuvius.math.niu.edu> wrote in message
news:e7sjng$7ue$1@news.math.niu.edu...
Quote:
In article <12a343b588impc7@corp.supernews.com>,
Abstract Dissonance <Abstract.Dissonance@hotmail.com> wrote:

I'm looking for a function on a subset of N that is "permutation
invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Two sequences of numbers can only be equivalent under a permutation if
they have the same cardinality n .


Ofcourse ;)

Quote:
Two sequences of numbers of the same cardinality n are permutations
of each other iff the first n elementary symmetric functions of the
numbers are equal. That is, they must have the same sum, the same
sum-of-products-taken-two-at-a-time, ..., and the same product.
It is equivalent (in characteristic zero) to say that they have
the same sums of powers, that is, the two sums must be equal, the
two sums of the squares must be equal, ... and the sums of the n-th
powers must be equal.


I think I recall something similar when I took group theory... is that where
it comes from? Cyclotomic pops in my mind for some reason. I guess I can
go look in my book and see.

Quote:
One more equivalence: the sequences are permutations of each other
iff the polynomials \prod ( X - a_i ) and \prod( X - b_i ) are
equal. You can compute these as the terms of the sequence go
streaming by, that is, you don't even have to have possession of
the whole sequence at any one time (nor indeed even know its length
when you start).

As a practical matter, these techniques can be difficult to use for
sequences of real numbers; for example the polynomial
(X-1)(X-2)...(X-20)
has integer coefficients but they are as large as 20! and so you may
not be able to store them exactly with a naive algorithm. Rounding each
to a 10-digit floating-point number gives another polynomial that
doesn't even have 20 real roots! (Maple finds real roots only near
1, 2, 3, 4, 5, and 6.8 ). Since you specified INTEGER sequences you
will have no roundoff error, but you will have to do something to
handle potentially large integers. It's probably easiest to compute
several values of the form \sum_i (a_i)^j mod p_k for
j = 1, 2, ..., n where the p_k range over a sufficiently large
set of smallish primes.

What makes for an optimal implementation will depend on things like
the sizes of n and the a_i, as well as a sense of the likelihood
that the sequences really will be permutations of each other, and
a sense of the ways in which they will "resemble" each other when
they are actually not simply rearrangements of each other.

You can if you wish combine the multiple things I have suggested you
compute into a single integer (that is, I am recommending a function
f which maps sequences to _sequences_, whereas you seem to want
f([a1, ..., a_n]) to be a _number_). Clearly such an approach is of
limited utility: if you want f(a) to be (1) an integer, (2) different
when a and b are not permutations of each other, and (3) small
enough to be worked easily, then it will only be possible to use f
on a small set of possible input sequences (e.g. 2^8 of them if f(a)
is to be a single byte).


I wasn't plan on using large sequences... n being < 20 which would be the
max(a_k) too. I was working with partitions of an integer and come up with
some algorithm to generate them but it turns out the algorithm doesn't
generate all the partitions.

e.g., suppose I want to partition 5,

5
4 + 1
3 + 2, 3 + 1 + 1
2 + 3, 2 + 2 + 1, 2 + 1 + 1 + 1
1 + 4, 1 + 3 + 1, 1 + 2 + 2, 1 + 2 + 1 + 1, 1 + 1 + 1 + 1 + 1


where basically you start from the second term and carry over a 1 from it
into the next term unless it would make the term larger which then it carry
over to the term after that.

So as you can see, in the above generation there are duplicates. Such as 1
+ 4 = 4 + 1 and I wanted an easy way to remove them. Since I wrote a program
to do this I had the terms in arrays and I could just use the built in sort
method to sort them then compare term by term... but it led me to think if
there was some "mathematical" and "non-algorithmic" way.

Figuring that I could compute some type of "characteristic" for each
partition then just remove the duplicate entries with the same
characteristics.

What I was going to see is what kinda relationship there was between the
total number of partitions generated(including duplicates) by this algorithm
with the actual number of partitions(excluding duplicates).

I think I messed up the implementation of the algorithm because it didn't
generate all the partitions for n > 7. I think because I went the wrong way
when when looking for the largest element to carry from. I'll try to fix it
later and see what happens.

Thanks,
Jon
Back to top
mike3
science forum addict


Joined: 27 May 2005
Posts: 52

PostPosted: Wed Jun 28, 2006 6:56 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

Abstract Dissonance wrote:
Quote:
Is there a way to compare two integer sequences that are permutations of
each other?

I'm looking for a function on a subset of N that is "permutation invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)


Well, adding up all the elements seems to work, but can produce
duplicates,
for example the sequences [2,3,1], [1,2,3], and [1,4,1] are all
identified as the
"same". Do you want the first 2 to be recognized as distinct from the
other,
but those 2 themselves as equivalent to each other?

Quote:
Ofcourse there are many "algorithmic" ways to do this such as sorting then
comparing element wise but I'm wondering if there is a better way.


Yep, but those can still be used to define a mathematical function.

Quote:
I was thinking maybe one could have a polynomial such as

sum(Product(|a_k - b_j|,j=1..n),k=1..n)

which will give zero if a and b are permutations of each other(by which I
mean a_k = b_i_k for some permutation i_k of the index k).


So you want a function that compares two sequences, regardless of
order,
returns one result, like 0, if the two are permutations of each other,
or are
identical, but returns another, like 1, if they are not, and does that
100%
of the time?

Quote:
I'm not completely convinced this gives what I want though, although I think
it might(as I it just came off the top of my head as I was writing this).


It seems to work, but I'm not quite sure. Let's see, though. OK, in
order for it
to be "0" the product prod_{j=1...n} |a_k - b_j| must contain at least
one point
at which a_k = b_j (ie. one zero). For the sum to be zero, then either
all
the terms have to be or some have to be negative. The latter is
impossible,
because the product is over all the *absolute values* of the difference
a_k - b_j, and there's no sign-changing things in there (it doesn't
alternate
like +-+-+-+-... for example), so if it sums to 0, then *every* product
must
be 0, which means that for each b_j there is an a_k such that b_j - a_k
= 0
(or, for each a_k there is a b_j such that b_j - a_k = 0). This
occurence must
exist in every term of the sum, so every one of the products
prod_{j=1...n} |a_k - b_j| = 0 for all k. Therefore, we have b_j = a_k
for
1 j and every k. Since k goes through all the possible values of j, we
acquire the condition that no j has to be "safe", ie. b_j != a_k for
one of the
k values, in order for it to be 0. Therefore, the function is 0 if, for
every k,
there exists a j such that b_j - a_k = 0. Since j and k are going both
through
(1...n), then, the sequence b_1, b_2, ... b_n must have, for some
function
f(n), contain the elements a_1, a_2, ..., a_n (ie. b_f(1) = a_1, b_f(2)
= a_2,
...., b_f(n) = a_n.), and the only case this can happen is if the two
are
permutations of each other. I'm not quite sure how to make this
argument
more "rigorous", so any comments/critique will be appreciated.

Quote:
I think this will work and will do some more investigating or maybe try and
prove it(I it should be easy if it works)... but maybe there are other ways
too?

Jon
Back to top
Dann Corbit
science forum beginner


Joined: 02 Jun 2006
Posts: 47

PostPosted: Wed Jun 28, 2006 7:21 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

Quote:
Abstract Dissonance wrote:
Is there a way to compare two integer sequences that are permutations of
each other?

I'm looking for a function on a subset of N that is "permutation
invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)


If the sets are small (e.g. if no element is bigger than 63) a nice way
would be to tag the bits in a 64 bit integer and then just use integer
comparison.

If the sets are arbitrary, then sort the sets. It's brutal, but it would
work.

What is the real problem that you are trying to solve and what do the sets
look like?

[snip]
Back to top
Tim Peters
science forum Guru


Joined: 30 Apr 2005
Posts: 426

PostPosted: Fri Jun 30, 2006 2:17 am    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

[Abstract Dissonance, trying to tell whether one sequence is a
permutation of another]
Quote:
...
What I mean is something that is simple to implement. Doesn't
necessarily have to be fast as I'm only using it sequences of
small size.

Then you have to define the implementation language first Smile. For example,
I usually reach for Python when experimenting, and the only approach I'd
even consider for small sequences in that language is the obvious
("sorted()" and sequence comparison are built in):

def is_permutation(a, b):
return sorted(a) == sorted(b)

Quote:
I was going to just sort them then compare element wise as thats
pretty easy to implement

Depends on the language, but so long as sorting is built in that gives an
"obviously correct" approach. In programming and math, "obviously correct"
usually beats "not obviously wrong", and both always beat "WTF?!" ;-)

Quote:
but I was curious as to if there was a mathematical function that would
do the same thing. Similar to what I shown above but that doesn't work ;/
It might be able to be modified somewhat though

Product(|#(a_k)*a_k - #(b_k)*b_j|,j=1..n)

where #(a_k) returns the number of duplicates of the element a_k in the
sequence. I think that might fix the problem and is sorta like using
the multiset idea?

Yup, if done right. I wouldn't try to make this work, because it's both
more obscure and less efficient than the sorting approach.

[Tim Peters]
Quote:
...
Do define "better" first. You can, e.g., get simple but absurdly
impractical solutions via stuff like:

f(a) = product(p_a_i, i=1..n)

where p_a_i denotes the a_i'th prime. For example,

f([1,1,1,2]) = 2*2*2*3 = 24
and
f([2,1,1,1]) = 3*2*2*2 = 24
and
f([1,2,1,2]) = 2*3*2*3 = 36
and
f([1,2,3,4]) = 2*3*5*7 = 210

etc. It's easy then to prove that f(a) = f(b) if and only if a is a
permutation of b. In effect, this _is_ building a hash-based multiset
representation, but in a spectacularly impractical way.

lol... its not that impratical! Wink I could use a look up table for the
primes and it would be quite fast.

Are you sure? You later said you can have up to 20 integers each as large
as 20, so the product could get as large as:

p_20^20 = 71^20 = 10596610576391421032662867140133202401

Again this depends on the language. There's no bound on integer size in
Python, but in most languages you'll have to muck with "big integer"
packages if you need more than 64 bits.

Dave Rusin's idea of building a polynomial is also elegant, and doesn't
require a table of primes, but in computing Product(x-a_i) the constant term
can get as large as 20^20 = 104857600000000000000000000, and again you're
outside the range of 64-bit integers.

Quote:
Ofcourse I don't really want to have to read in an array of primes
to do a simple computation. Its not that it can't be done in a simple
way using algorithms in C++ but just that I was wondering if there was
a simple mathematical way.

You have given a pretty simple one in some sense. It is based on primes
though which makes it a tad more complicated. Maybe there is another
way that doesn't have some hidden complexities involved(i.e., finding
primes) but is based strictly on algebra?

sum(1 << (5*a_i)) would be correct (if an element can appear no more than 31
times) and blazing fast in C++ -- if it had 105-bit integers.

Quote:
similar to(if it works).

sum(Product(|#(a_k)*a_k - #(b_k)*b_j|,j=1..n),k=1..n)

but even the # function is sorta "algorithmic".... (ofcourse one can say
that sum and product are too but there seems to be some difference
between the two... one involves "searching and comparing" while the other
is purely computational).

If you peek under the covers and look at the generated machine code, you'll
probably find that your absolute values generate compares and branches too.

Quote:
I'm not to worried about the time complexity though as I'm more
interesting in its "algorithmic" simplicity(basically just algebra).

So let's respell sum(1 << (5*a_i)) as:

H(a) = sum(32^a_i)

Then so long as each a_i is a non-negative integer, and no a_i appears more
than 31 times, H(a) = H(b) iff a is a permutation of b.

OK, I have to admit that I find the nested sum/product thingies plain ugly.
If you're _assuming_ from the start that a and b both have length n, then
it's enough that each element in a appear exactly the same number of times
in a as in b. At least let yourself use the Kronecker delta function,

d(i, j) = 1 if i = j, or 0 if i =/= j

In C classic, d(i, j) is implemented simply as "i == j".

Then the number of times a_i appears in a is

sum(d(a_i, a_j), j=1,n)

and the number of times a_i appears in b is similarly

sum(d(a_i, b_j), j=1,n)

so

d(sum(d(a_i, a_j), j=1,n), sum(d(a_i, b_j), j=1,n))

is 1 if a_i appears the same number of times in a and b, and 0 if not, so a
is a permutation of b iff

sum(d(sum(d(a_i, a_j), j=1,n), sum(d(a_i, b_j), j=1,n)), i=1,n) = n

But that's nuts Wink
Back to top
Rob Pratt
science forum beginner


Joined: 22 Jun 2005
Posts: 22

PostPosted: Fri Jun 30, 2006 5:36 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

"Abstract Dissonance" <Abstract.Dissonance@hotmail.com> wrote in message
news:12a5fg38k73sv99@corp.supernews.com...
Quote:

"Dave Rusin" <rusin@vesuvius.math.niu.edu> wrote in message
news:e7sjng$7ue$1@news.math.niu.edu...
In article <12a343b588impc7@corp.supernews.com>,
Abstract Dissonance <Abstract.Dissonance@hotmail.com> wrote:

I'm looking for a function on a subset of N that is "permutation
invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Two sequences of numbers can only be equivalent under a permutation if
they have the same cardinality n .


Ofcourse ;)

Two sequences of numbers of the same cardinality n are permutations
of each other iff the first n elementary symmetric functions of the
numbers are equal. That is, they must have the same sum, the same
sum-of-products-taken-two-at-a-time, ..., and the same product.
It is equivalent (in characteristic zero) to say that they have
the same sums of powers, that is, the two sums must be equal, the
two sums of the squares must be equal, ... and the sums of the n-th
powers must be equal.


I think I recall something similar when I took group theory... is that
where it comes from? Cyclotomic pops in my mind for some reason. I guess
I can go look in my book and see.

One more equivalence: the sequences are permutations of each other
iff the polynomials \prod ( X - a_i ) and \prod( X - b_i ) are
equal. You can compute these as the terms of the sequence go
streaming by, that is, you don't even have to have possession of
the whole sequence at any one time (nor indeed even know its length
when you start).

As a practical matter, these techniques can be difficult to use for
sequences of real numbers; for example the polynomial
(X-1)(X-2)...(X-20)
has integer coefficients but they are as large as 20! and so you may
not be able to store them exactly with a naive algorithm. Rounding each
to a 10-digit floating-point number gives another polynomial that
doesn't even have 20 real roots! (Maple finds real roots only near
1, 2, 3, 4, 5, and 6.8 ). Since you specified INTEGER sequences you
will have no roundoff error, but you will have to do something to
handle potentially large integers. It's probably easiest to compute
several values of the form \sum_i (a_i)^j mod p_k for
j = 1, 2, ..., n where the p_k range over a sufficiently large
set of smallish primes.

What makes for an optimal implementation will depend on things like
the sizes of n and the a_i, as well as a sense of the likelihood
that the sequences really will be permutations of each other, and
a sense of the ways in which they will "resemble" each other when
they are actually not simply rearrangements of each other.

You can if you wish combine the multiple things I have suggested you
compute into a single integer (that is, I am recommending a function
f which maps sequences to _sequences_, whereas you seem to want
f([a1, ..., a_n]) to be a _number_). Clearly such an approach is of
limited utility: if you want f(a) to be (1) an integer, (2) different
when a and b are not permutations of each other, and (3) small
enough to be worked easily, then it will only be possible to use f
on a small set of possible input sequences (e.g. 2^8 of them if f(a)
is to be a single byte).


I wasn't plan on using large sequences... n being < 20 which would be the
max(a_k) too. I was working with partitions of an integer and come up
with some algorithm to generate them but it turns out the algorithm
doesn't generate all the partitions.

e.g., suppose I want to partition 5,

5
4 + 1
3 + 2, 3 + 1 + 1
2 + 3, 2 + 2 + 1, 2 + 1 + 1 + 1
1 + 4, 1 + 3 + 1, 1 + 2 + 2, 1 + 2 + 1 + 1, 1 + 1 + 1 + 1 + 1


where basically you start from the second term and carry over a 1 from it
into the next term unless it would make the term larger which then it
carry over to the term after that.

So as you can see, in the above generation there are duplicates. Such as
1 + 4 = 4 + 1 and I wanted an easy way to remove them. Since I wrote a
program to do this I had the terms in arrays and I could just use the
built in sort method to sort them then compare term by term... but it led
me to think if there was some "mathematical" and "non-algorithmic" way.

Figuring that I could compute some type of "characteristic" for each
partition then just remove the duplicate entries with the same
characteristics.

What I was going to see is what kinda relationship there was between the
total number of partitions generated(including duplicates) by this
algorithm with the actual number of partitions(excluding duplicates).

I think I messed up the implementation of the algorithm because it didn't
generate all the partitions for n > 7. I think because I went the wrong
way when when looking for the largest element to carry from. I'll try to
fix it later and see what happens.

Thanks,
Jon


Instead of generating duplicates and then removing them, why not generate
just the canonical partitions in the first place? Let p(n,k) be the number
of partitions of n into parts of size at most k. Then the following
recurrence relation, which can be obtained by conditioning on whether a part
of size k appears, suggests an algorithm.

p(n,k) = p(n-k,k) + p(n,k-1)

You want p(n,n).


Rob Pratt
Back to top
Tim Peters
science forum Guru


Joined: 30 Apr 2005
Posts: 426

PostPosted: Fri Jun 30, 2006 9:12 pm    Post subject: Re: Mathematical formula to compare integer sequence Reply with quote

[Abstract Dissonance]
Quote:
...
I wasn't plan on using large sequences... n being < 20 which would be
the max(a_k) too. I was working with partitions of an integer and come
up with some algorithm to generate them but it turns out the algorithm
doesn't generate all the partitions.

e.g., suppose I want to partition 5,

5
4 + 1
3 + 2, 3 + 1 + 1
2 + 3, 2 + 2 + 1, 2 + 1 + 1 + 1
1 + 4, 1 + 3 + 1, 1 + 2 + 2, 1 + 2 + 1 + 1, 1 + 1 + 1 + 1 + 1


where basically you start from the second term and carry over a 1 from
into the next term unless it would make the term larger which then it
carry over to the term after that.

So as you can see, in the above generation there are duplicates. Such
as 1 + 4 = 4 + 1 and I wanted an easy way to remove them. Since I wrote
a program to do this I had the terms in arrays and I could just use the
built in sort method to sort them then compare term by term... but it
led me to think if there was some "mathematical" and "non-algorithmic"
way.

...


[Rob Pratt]
Quote:
Instead of generating duplicates and then removing them, why not generate
just the canonical partitions in the first place? Let p(n,k) be the
number of partitions of n into parts of size at most k. Then the
following recurrence relation, which can be obtained by conditioning on
whether a part of size k appears, suggests an algorithm.

p(n,k) = p(n-k,k) + p(n,k-1)

You want p(n,n).

Sounds like a good idea to me too :-)

Jon, there are many ways to do this directly. Here's a short & sweet
recursive way due to David Eppstein, coded in Python:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/218332

I added the fancier iterative method I usually use in a comment down below
on that page: by representing a partition as a multiset (a mapping from
integers to the number of times they appear in a given partition of N), it's
possible to generate each successive partition in constant time (independent
of N!), and using worst-case space proportional to sqrt(N).

For example, using that function:

Quote:
for p in gen_partitions_ms(6):
.... print p


{6: 1}
{1: 1, 5: 1}
{2: 1, 4: 1}
{1: 2, 4: 1}
{3: 2}
{1: 1, 2: 1, 3: 1}
{1: 3, 3: 1}
{2: 3}
{1: 2, 2: 2}
{1: 4, 2: 1}
{1: 6}

or

Quote:
count = 0
for p in gen_partitions_ms(6000):
.... print p

.... count += 1
.... if count > 10:
.... break

{6000: 1}
{1: 1, 5999: 1}
{2: 1, 5998: 1}
{1: 2, 5998: 1}
{3: 1, 5997: 1}
{1: 1, 2: 1, 5997: 1}
{1: 3, 5997: 1}
{4: 1, 5996: 1}
{1: 1, 3: 1, 5996: 1}
{2: 2, 5996: 1}
{1: 2, 2: 1, 5996: 1}
Back to top
Google

Back to top
Display posts from previous:   
Post new topic   Reply to topic Page 1 of 1 [14 Posts] View previous topic :: View next topic
The time now is Sat Jan 10, 2009 4:00 am | All times are GMT
Forum index » Science and Technology » Math
Jump to:  

Similar Topics
Topic Author Forum Replies Last Post
No new posts Compare and contrast physics and chem... parent Chem 0 Mon Jan 08, 2007 4:26 pm
No new posts sequence of polynomials artur_steiner@yahoo.com Math 1 Tue Jul 18, 2006 1:20 am
No new posts In which (ordered) spaces an increasi... lataianu bogdan Math 5 Mon Jul 17, 2006 7:11 pm
No new posts This Week's Finds in Mathematical Phy... John Baez Research 0 Mon Jul 17, 2006 3:32 pm
No new posts Stumped with figuring a formula... moriman Recreational 8 Mon Jul 17, 2006 12:21 am

Debt Consolidation | Bankruptcy | Bankruptcy | Mobile Phone | Bad Credit Mortgages
Copyright © 2004-2005 DeniX Solutions SRL
Other DeniX Solutions sites: Electronics forum |  Medicine forum |  Unix/Linux blog |  Unix/Linux documentation |  Unix/Linux forums


Powered by phpBB © 2001, 2005 phpBB Group
[ Time: 1.5753s ][ Queries: 16 (0.1609s) ][ GZIP on - Debug on ]