Minimal Keys and Antikeys
By V. D. THI
§ 1. Introduction
The relational model, defined by E. F. Codd [3], is one of the most investigated
data base models of recent years. Many papers have appeared concerning the
combinatorial characterization of functional dependencies, systems of minimal keys
and antikeys. A set of minimal keys and a set of antikeys each form a Sperner-system.
Sperner-systems and sets of minimal keys are equivalent in the sense that for an
arbitrary Sperner-system S a family of functional dependencies F can be constructed
so that the minimal keys of F are exactly the elements of S (cf. [4]).
In the present paper we propose some combinatorial algorithms to determine
antikeys and minimal keys. In the second part of the paper, we study
connections between minimal keys and antikeys for special Sperner-systems.
We start with some necessary definitions.
Definition 1.1. Let Ω be a finite set, and denote by P(Ω) its power set. The mapping
F: P(Ω) → P(Ω) is called a closure operation over Ω if, for every A, B ⊆ Ω,
(1) A ⊆ F(A) (extensivity),
(2) A ⊆ B implies F(A) ⊆ F(B) (monotonicity),
(3) F(A) = F(F(A)) (idempotency).
In some cases Ω is represented by the set {1, ..., n} or by the set of columns of an
m×n matrix M. If we use the second representation, a special closure operation F_M
can be defined over the set of the columns of M:
The i-th column of M belongs to F_M(A) if and only if any two rows of M
which are identical on A are equal on the i-th column, too.
It is easy to see that F_M is a closure operation. It is known (see [1]) that any
closure operation F over a finite set Ω can be represented by an appropriate matrix
M, that is, we can choose M and represent Ω by the set of the columns of M so that F
coincides with F_M.
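The definition of F_M can be illustrated by a minimal Python sketch. The function name `closure_from_matrix` and the 3-row matrix are illustrative assumptions, not taken from the paper; columns are indexed from 0.

```python
from itertools import combinations

def closure_from_matrix(rows, attrs):
    """Return F_M(attrs): the columns on which any two rows that agree
    on every column of `attrs` also agree."""
    n_cols = len(rows[0])
    result = set()
    for col in range(n_cols):
        determined = all(
            r1[col] == r2[col]
            for r1, r2 in combinations(rows, 2)
            if all(r1[a] == r2[a] for a in attrs)
        )
        if determined:
            result.add(col)
    return frozenset(result)

# Hypothetical matrix over columns 0, 1, 2 (illustrative data only).
M = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 1, 0),
]

print(sorted(closure_from_matrix(M, {0})))     # rows 0 and 1 agree on column 0 only
print(sorted(closure_from_matrix(M, {0, 1})))  # no two rows agree on both columns
```

In this toy matrix no single column determines another, while any two columns together form a key.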
Definition 1.2. Let F be a closure operation over Ω, and A ⊆ Ω. We say that
— A is a key of F, if F(A) = Ω;
— A is a minimal key of F, if A is a key of F and for any B ⊆ A, F(B) = Ω
implies B = A, i.e. no proper subset of A is a key of F.
Let us denote by K_F the set of all minimal keys of F. It is clear that K_F forms a
Sperner-system.
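As a small illustration of Definition 1.2, the following sketch enumerates K_F by brute force. The closure used here is generated by a hypothetical family of functional dependencies (the names `fd_closure`, `minimal_keys`, `OMEGA` and `FDS` are assumptions made for the example); the enumeration is exponential in |Ω| and is meant for small examples only.

```python
from itertools import combinations

def fd_closure(attrs, fds):
    """Closure of `attrs` under functional dependencies given as (lhs, rhs) set pairs."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return frozenset(closed)

def minimal_keys(universe, closure):
    """Enumerate all minimal keys by increasing size."""
    universe = frozenset(universe)
    keys = []
    for size in range(len(universe) + 1):
        for cand in map(frozenset, combinations(sorted(universe), size)):
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, hence not minimal
            if closure(cand) == universe:
                keys.append(cand)
    return set(keys)

# Hypothetical dependencies over {1, 2, 3, 4}: 1 -> 2, 2 -> 1, {1, 3} -> 4.
OMEGA = {1, 2, 3, 4}
FDS = [({1}, {2}), ({2}, {1}), ({1, 3}, {4})]
K_F = minimal_keys(OMEGA, lambda a: fd_closure(a, FDS))
print(sorted(sorted(k) for k in K_F))
```

No member of the resulting family contains another, which is the Sperner property noted above.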
We have to prove that K_{q+1} = {B_1, ..., B_{q+1}}^{-1}. For this, using the inductive
hypothesis K_q = {B_1, ..., B_q}^{-1}, we show that
a) if A ∈ K_{q+1} then A is a subset of Ω not containing B_t (t = 1, ..., q+1)
and being maximal for this property, i.e. A ∈ {B_1, ..., B_{q+1}}^{-1},
b) every A ⊆ Ω not containing the elements B_t (t = 1, ..., q+1) and being
maximal for this property is an element of K_{q+1}.
First we prove the validity of (a). Let A ∈ K_{q+1}. If A ∈ F_q then A does not
contain the elements B_t (t = 1, ..., q), A is maximal for this property, and at the
same time B_{q+1} ⊈ A. Consequently, A is a maximal subset of Ω not containing
B_t (t = 1, ..., q+1).
Let A ∈ K_{q+1} \ F_q. It is clear that there is an A_t^i (1 ≤ i ≤ p and 1 ≤ t ≤ r_i) such
that A = A_t^i. Our construction shows that B_l ⊈ A_t^i for all l (l = 1, ..., q+1). Because
A_t^i is an antikey of {B_{q+1}} for X_i, we obtain A_t^i = X_i \ {b} for some b ∈ B_{q+1}. It is
obvious that B_{q+1} ⊆ A_t^i ∪ {b}. If a ∈ Ω \ X_i then, by the inductive hypothesis, for
A_t^i ∪ {a, b} = X_i ∪ {a} there exists B_s (s = 1, ..., q) such that B_s ⊆ A_t^i ∪ {a, b}. X_i
does not contain B_1, ..., B_q by X_i ∈ K_q. Hence a ∈ B_s. If b ∉ B_s then
B_s ⊆ A_t^i ∪ {a}. For every B_s (1 ≤ s ≤ q) with B_s ⊆ X_i ∪ {a} and B_s ⊈ A_t^i ∪ {a} we have
b ∈ B_s. Hence B_s \ {a, b} ⊆ A_t^i. Consequently, there exists an A_1 ∈ F_q such that ...
Next we turn to the proof of (b). Suppose that A is a maximal subset of Ω not
containing B_t (1 ≤ t ≤ q+1). By the inductive hypothesis, there is a Y ∈ K_q such that
A ⊆ Y.
The first case: If B_{q+1} ⊈ Y then Y does not contain B_1, ..., B_{q+1}. Because A is
a maximal subset of Ω not containing B_t (1 ≤ t ≤ q+1) we obtain A = Y. B_{q+1} ⊈ Y
implies A ∈ F_q. Consequently, we have A ∈ K_{q+1}.
The second case: If B_{q+1} ⊆ Y then Y = X_i holds for some i in {1, ..., p} and
A ⊆ A_t^i holds for some t in {1, ..., r_i}. If there exists an A_1 ∈ F_q such that A_t^i ⊂ A_1,
then we also have A ⊂ A_1. By the definition of F_q it is clear that A_1 does not contain
B_1, ..., B_{q+1}. This contradicts the definition of A. Hence A_t^i ∈ K_{q+1}. It is easy to see
that A_t^i does not contain B_1, ..., B_{q+1}. By the definition of A we obtain A = A_t^i,
i.e. K_{q+1} = {B_1, ..., B_{q+1}}^{-1}.
By the above proof it is clear that K_m = {B_1, ..., B_m}^{-1}. Thus we have
Theorem 2.2. K_m = K^{-1}.
Because K and K^{-1} are uniquely determined by each other, the determination of
K^{-1} based on our algorithm does not depend on the order of B_1, ..., B_m.
Now we assume that the elementary step being counted is the comparison of two
attribute names. Consequently, if we assume that subsets of Ω are represented as
sorted lists of attribute names, then a Boolean operation on two subsets of Ω requires
at most |Ω| elementary steps.
Let K_0 = {Ω}. According to the construction of our algorithm we have K_q =
F_q ∪ {X_1, ..., X_{t_q}}, where 1 ≤ q ≤ m−1. Denote by l_q the number of elements of K_q.
It is clear that for constructing K_{q+1} the worst-case time of the algorithm is
O(n²(l_q − t_q)t_q) if t_q < l_q and O(n² t_q) if l_q = t_q. Consequently, the total time spent
by the algorithm in the worst case is
\[
O\Bigl(n^2 \sum_{q=1}^{m-1} t_q u_q\Bigr), \quad\text{where}\quad
u_q = \begin{cases} l_q - t_q & \text{if } l_q > t_q,\\ 1 & \text{if } l_q = t_q. \end{cases}
\]
Since each K_q is a Sperner-system, we have
\[
l_q \le \binom{n}{[n/2]} \approx \frac{2^{\,n+1/2}}{(\pi n)^{1/2}}.
\]
Consequently, the worst-case time of our algorithm cannot be more than
exponential in the number of attributes.
Let K^{-1} = {A_1, ..., A_l} be a set of antikeys. Let R = {h_0, h_1, ..., h_l} be a relation
over Ω given as follows: for all a ∈ Ω, h_0(a) = 0, and for i (1 ≤ i ≤ l),
\[
h_i(a) = \begin{cases} 0 & \text{if } a \in A_i,\\ i & \text{if } a \in \Omega \setminus A_i. \end{cases}
\]
If we consider R as a matrix, then R represents K (see [5]). Thus, based on our
algorithm, for an arbitrarily given Sperner-system K, we can construct a matrix which
represents K.
Example 2.3. Let Ω = {1, 2, 3, 4, 5, 6} and K = {(2, 3, 4), (1, 4)}. According to
the above algorithm we have K_1 = {(1, 3, 4, 5, 6), (1, 2, 4, 5, 6)} ∪ F_1, where
F_1 = {(1, 2, 3, 5, 6)}, and K_2 = {(3, 4, 5, 6), (2, 4, 5, 6), (1, 2, 3, 5, 6)}. It is obvious
that K^{-1} = K_2.
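The steps of Example 2.3 can be replayed with the following sketch. It is reconstructed from the proof of Theorem 2.2 rather than copied from the paper's statement of the algorithm, so the function name `antikeys` and the exact control flow are assumptions: starting from K_0 = {Ω}, each key B_{q+1} splits K_q into the sets that do not contain it (F_q) and the sets X_i that do, and every X_i is replaced by the sets X_i minus {b}, b ∈ B_{q+1}, that are not contained in a member of F_q.

```python
def antikeys(universe, keys):
    """Maximal subsets of `universe` that contain no member of `keys`,
    computed incrementally, one key at a time."""
    current = {frozenset(universe)}                     # K_0
    for key in map(frozenset, keys):
        kept = {a for a in current if not key <= a}     # F_q
        split = current - kept                          # X_1, ..., X_{t_q}
        new = set(kept)
        for x in split:
            for b in key:
                cand = x - {b}
                # discard candidates already covered by a kept set
                if not any(cand <= a for a in kept):
                    new.add(cand)
        current = new                                   # K_{q+1}
    return current

OMEGA = range(1, 7)
K = [{2, 3, 4}, {1, 4}]
K_inv = antikeys(OMEGA, K)
print(sorted(sorted(a) for a in K_inv))
```

Processing the two keys in the opposite order yields the same family, in line with the remark that the result does not depend on the order of B_1, ..., B_m.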
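We consider the following matrix, whose columns correspond to the attributes
1, 2, 3, 4, 5, 6:
\[
M = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0\\
1 & 1 & 0 & 0 & 0 & 0\\
2 & 0 & 2 & 0 & 0 & 0\\
0 & 0 & 0 & 3 & 0 & 0
\end{pmatrix}
\]
It is clear that M represents K.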
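The construction of the relation R from a set of antikeys can be sketched as follows; the function name `matrix_from_antikeys` is an assumption, and the antikeys are passed in the order A_1, A_2, A_3 used in Example 2.3.

```python
def matrix_from_antikeys(universe, antikey_list):
    """Rows h_0, ..., h_l: h_0 is all zeros; h_i(a) = 0 if a is in A_i, else i."""
    cols = sorted(universe)
    rows = [tuple(0 for _ in cols)]
    for i, a_i in enumerate(antikey_list, start=1):
        rows.append(tuple(0 if a in a_i else i for a in cols))
    return rows

M = matrix_from_antikeys(
    range(1, 7),
    [{3, 4, 5, 6}, {2, 4, 5, 6}, {1, 2, 3, 5, 6}],
)
for row in M:
    print(*row)
```

Each row h_i agrees with h_0 exactly on the antikey A_i, which is what makes the antikeys reappear as maximal agreement sets of the matrix.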
Now we describe the "reverse" algorithm: for a given Sperner-system considered
as the set of antikeys we construct its origin. The following definitions are necessary
for us.
Let F be a closure operation over Ω. Set
Z(F) = {A ⊆ Ω : F(A) = A},
T(F) = {A ∈ Z(F) \ {Ω} : there is no A' ∈ Z(F) \ {Ω} with A ⊂ A'}.
The elements of Z(F) are called closed sets. It is clear that T(F) is the family of
maximal closed sets (except Ω). Now we prove the following lemma.
Lemma 2.4. Let F be a closure operation over Ω, and K_F the set of minimal keys
of F. Then K_F^{-1} = T(F).
Proof. Let A be an arbitrary antikey and suppose that A ⊂ F(A). Since A is a
maximal set containing no key, F(A) is a key. Hence F(F(A)) = F(A) = Ω.
Consequently, A is a key. This contradicts ∀B ∈ K_F: B ⊈ A. So A is closed.
If there is an A' such that A ⊂ A' and A' ∈ Z(F) \ {Ω}, then A' is a key, that is,
F(A') = Ω. This contradicts F(A') = A' ⊂ Ω.
On the other hand, if A is a maximal closed set and there is a B (B ∈ K_F) such
that B ⊆ A, then F(A) = Ω, which conflicts with the fact that A ⊂ Ω. If A ⊂ D
(D ⊆ Ω), then it can be seen that F(D) = Ω (because A is a maximal closed set).
Consequently, A is an antikey. The lemma is proved.
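Lemma 2.4 can be checked numerically on the matrix of Example 2.3. The sketch below is a brute-force verification over all 64 subsets, not the paper's method; the helper names (`closure`, `maximal`, `minimal`) are assumptions.

```python
from itertools import combinations

COLS = (1, 2, 3, 4, 5, 6)
ROWS = [
    (0, 0, 0, 0, 0, 0),
    (1, 1, 0, 0, 0, 0),
    (2, 0, 2, 0, 0, 0),
    (0, 0, 0, 3, 0, 0),
]
OMEGA = frozenset(COLS)

def closure(attrs):
    """F_M(attrs) for the matrix ROWS with columns labelled by COLS."""
    pos = {c: j for j, c in enumerate(COLS)}
    agreeing = [
        (r, s) for r, s in combinations(ROWS, 2)
        if all(r[pos[a]] == s[pos[a]] for a in attrs)
    ]
    return frozenset(
        c for c in COLS if all(r[pos[c]] == s[pos[c]] for r, s in agreeing)
    )

def maximal(family):
    family = list(family)
    return {a for a in family if not any(a < b for b in family)}

def minimal(family):
    family = list(family)
    return {a for a in family if not any(b < a for b in family)}

subsets = [frozenset(c) for n in range(7) for c in combinations(COLS, n)]
K_F = minimal(a for a in subsets if closure(a) == OMEGA)            # minimal keys
antikeys = maximal(a for a in subsets if not any(k <= a for k in K_F))
T_F = maximal(a for a in subsets if closure(a) == a and a != OMEGA)  # maximal closed sets

print(antikeys == T_F)
```

The same run also recovers the minimal keys (1, 4) and (2, 3, 4), consistent with the statement that M represents K.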
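\[
T_{q+1} = \begin{cases}
T_q \setminus \{b_{q+1}\} & \text{if } \forall B_i \in H \setminus G:\ T_q \setminus \{b_{q+1}\} \not\subseteq B_i,\\
T_q & \text{otherwise.}
\end{cases}
\]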
Theorem 2.5. If H is a set of antikeys, then T_0, T_1, ..., T_m are keys and
T_m is a minimal key.
Proof. By Remark 2.1 there exists a closure operation F such that H = K_F^{-1}. We prove
the theorem by induction. It is clear that T_0 is a key. If T_q is a key and T_{q+1} = T_q, then
it is obvious that T_{q+1} is a key. If T_{q+1} = T_q \ {b_{q+1}} and F(T_{q+1}) ≠ Ω then, by
Lemma 2.4, there is a B_i ∈ H such that F(T_{q+1}) ⊆ B_i. Hence T_{q+1} ⊆ B_i, which
conflicts with the fact that ∀B_i ∈ H: T_{q+1} ⊈ B_i. Consequently, T_{q+1} is a key.
Now suppose that A is a proper subset of T_m. If a ∉ A, then, clearly, F(A) ≠ Ω.
If a ∈ A, then there exists a b_q ∈ B such that b_q ∈ T_m \ A (1 ≤ q). By the given algorithm
there exists a B_i ∈ H \ G such that T_{q−1} \ {b_q} ⊆ B_i. We obtain
T_m \ {b_q} ⊆ T_{q−1} \ {b_q} ⊆ B_i by T_m ⊆ T_q (0 ≤ q ≤ m−1). Since A ⊆ T_m \ {b_q},
we have F(A) ≠ Ω. Consequently, T_m is a minimal key. The theorem is proved.
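Remark 2.6. Theorem 2.5 is also true if T_0 = {b_1, ..., b_m} is an
arbitrary key.
In this case define
\[
T_{q+1} = \begin{cases}
T_q \setminus \{b_{q+1}\} & \text{if } \forall B_i \in H:\ T_q \setminus \{b_{q+1}\} \not\subseteq B_i,\\
T_q & \text{otherwise.}
\end{cases}
\]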
— It is clear that the worst-case time of the algorithm is O(n² · |H|), where
n = |Ω| and |H| is the number of elements of H.
— It is best to choose B such that |B| is minimal.
— If there is a B such that ∀B_i ∈ H \ {B}: B_i ∩ B = ∅, and a belongs to the
union of the sets B_i ∈ H \ {B}, then {a, b} is a minimal key (∀b ∈ B).
— If Ω \ ⋃_{B_i ∈ H} B_i ≠ ∅, then {a} is a minimal key for every
a ∈ Ω \ ⋃_{B_i ∈ H} B_i.
— Let Y = ⋃_{B_i ∈ H \ G} B_i. If B \ Y ≠ ∅, then it is best to choose
T_0 = (B ∩ Y) ∪ {a} ∪ {b}, where b ∈ B \ Y.
Remark 2.7. Let H be a Sperner-system (Ω ∉ H) and A ⊆ Ω. We can give an
algorithm (analogous to the above one) to decide whether A is a key or not.
If A is a key, then this algorithm finds an A' such that A' ⊆ A and A' is a minimal key.
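A minimal sketch of such a procedure is given below, assuming the variant of Remark 2.6 in which every antikey of H is checked at each step. The function names `is_key` and `shrink_to_minimal_key` are hypothetical, and the antikeys are those of Example 2.3.

```python
def is_key(attrs, antikey_family):
    """A set is a key exactly when it is contained in no antikey."""
    return not any(attrs <= b for b in antikey_family)

def shrink_to_minimal_key(attrs, antikey_family):
    """Return a minimal key contained in `attrs`, or None if `attrs` is not a key."""
    current = frozenset(attrs)
    family = [frozenset(b) for b in antikey_family]
    if not is_key(current, family):
        return None
    for b in sorted(current):
        smaller = current - {b}
        if is_key(smaller, family):   # T_{q+1} = T_q minus {b_{q+1}}
            current = smaller         # otherwise T_{q+1} = T_q
    return current

H = [{3, 4, 5, 6}, {2, 4, 5, 6}, {1, 2, 3, 5, 6}]
print(shrink_to_minimal_key({1, 2, 3, 4, 5, 6}, H))
print(shrink_to_minimal_key({1, 4, 5}, H))
print(shrink_to_minimal_key({5, 6}, H))
```

Each element is tried once, so the loop performs at most |A| containment tests against H. Which minimal key is found depends on the order in which the elements are tried.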
Remark 2.8. In the paper [5] the equality sets of a relation are defined as
follows: Let R = {h_1, ..., h_m} be a relation over Ω. For i ≠ j, we denote by E_ij
the set {a ∈ Ω : h_i(a) = h_j(a)}, where 1 ≤ i ≤ m, 1 ≤ j ≤ m. Now we define
M = {E_ij : ∄E_pq such that E_ij ⊂ E_pq}. In practice, it is possible that some of the E_ij
are equal to each other; in that case we keep only one of them in M. According to Lemma 2.4
it can be seen that M is the set of antikeys of K_{F_R} (we consider R as a matrix).
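The maximal equality sets can be computed directly from the rows. The sketch below applies the definition to the matrix of Example 2.3; the function names are assumptions, and storing the E_ij in a Python set takes care of coinciding equality sets.

```python
from itertools import combinations

def equality_sets(rows, cols):
    """E_ij: the attributes on which rows i and j agree, for all i < j."""
    return {
        frozenset(c for c, x, y in zip(cols, r, s) if x == y)
        for r, s in combinations(rows, 2)
    }

def maximal_equality_sets(rows, cols):
    """Keep only the equality sets not properly contained in another one."""
    sets_ = equality_sets(rows, cols)
    return {e for e in sets_ if not any(e < other for other in sets_)}

COLS = (1, 2, 3, 4, 5, 6)
ROWS = [
    (0, 0, 0, 0, 0, 0),
    (1, 1, 0, 0, 0, 0),
    (2, 0, 2, 0, 0, 0),
    (0, 0, 0, 3, 0, 0),
]
print(sorted(sorted(e) for e in maximal_equality_sets(ROWS, COLS)))
```

For this matrix the maximal equality sets are exactly the three antikeys of Example 2.3.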
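\[
\binom{2k-1}{k-1}^{[n/(2k-1)]-1}\binom{2k-1+p}{k-1}
\quad\text{if } n \equiv p \pmod{2k-1} \text{ and } 1 \le p \le k-1,
\]
\[
\binom{2k-1}{k-1}^{[n/(2k-1)]}\binom{p}{k-1}
\quad\text{if } n \equiv p \pmod{2k-1} \text{ and } k \le p \le 2k-2,
\]
and the function f_{2k−2} for 2k−2 ≤ n by
\[
f_{2k-2}(n) = \begin{cases}
\binom{2k-2}{k-1}^{n/(2k-2)} & \text{if } n \equiv 0 \pmod{2k-2},\\[4pt]
\binom{2k-2}{k-1}^{[n/(2k-2)]-1}\binom{2k-2+p}{k-1} & \text{if } n \equiv p \pmod{2k-2} \text{ and } 1 \le p \le k-1,\\[4pt]
\binom{2k-2}{k-1}^{[n/(2k-2)]}\binom{p}{k-1} & \text{if } n \equiv p \pmod{2k-2} \text{ and } k \le p \le 2k-3,
\end{cases}
\]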
where N denotes the set of natural numbers. Let us take a partition Ω = X_1 ∪ ...
... ∪ X_m ∪ W, where m = [n/(2k−1)] and |X_i| = 2k−1 (1 ≤ i ≤ m).
[The displayed definition introduced by "Let" and the two displayed formulas introduced
by "It is clear that", stated for 1 ≤ |W| ≤ k−1 and for k ≤ |W| ≤ 2k−2 respectively,
are not legible in the source.]
[The chain of displayed estimates at this point is not legible in the source.]
Consequently, if n ≡ 0 (mod (2k−2)(2k−1)), then f_{2k−1}(n) ≥ f_{2k−2}(n). Now let n
be an arbitrary natural number. It can be seen that, for a fixed k, a bound holds
[not legible in the source; it involves \binom{2k-1+p}{k-1} and \binom{p}{k-1}], and
\[
\frac{f_{2k-1}(n)}{f_{2k-2}(n)} \to \infty \quad (n \to \infty).
\]
(It is easy to see that k=2 is also true.) The theorem is proved.
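The comparison for n ≡ 0 (mod (2k−2)(2k−1)) can be checked numerically. The sketch below rests on an assumption that is only partly legible in the source: when 2k−1 (respectively 2k−2) divides n, the function f_{2k−1} (respectively f_{2k−2}) is taken to be the corresponding binomial coefficient raised to the power n/(2k−1) (respectively n/(2k−2)). The names `f_odd` and `f_even` are hypothetical.

```python
from math import comb

def f_odd(k, n):
    """Assumed value of f_{2k-1}(n) when 2k-1 divides n."""
    assert n % (2 * k - 1) == 0
    return comb(2 * k - 1, k - 1) ** (n // (2 * k - 1))

def f_even(k, n):
    """Assumed value of f_{2k-2}(n) when 2k-2 divides n."""
    assert n % (2 * k - 2) == 0
    return comb(2 * k - 2, k - 1) ** (n // (2 * k - 2))

for k in range(2, 7):
    n = (2 * k - 1) * (2 * k - 2)
    print(k, n, f_odd(k, n) > f_even(k, n))
```

Under this assumption the ratio f_{2k−1}(n)/f_{2k−2}(n) is a fixed number greater than 1 raised to the power n, so it grows without bound along multiples of (2k−2)(2k−1).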
As a consequence of Theorem 2.12 and Theorem 2.10 we have
Corollary 2.13
FM S i2fik.1(n).
In this section we investigate connections between the minimal keys and antikeys
for some special Sperner-systems.
The notion of saturated Sperner-system is defined in [7], as follows:
A Sperner-system K over Ω is saturated if for any A ⊆ Ω with A ∉ K, K ∪ {A} is not a
Sperner-system.
An important result has been proved in [7]: if K is a saturated Sperner-system
then K = K_F uniquely determines F, where F is a closure operation.
Now we investigate some special Sperner-systems which are closely connected
with saturated Sperner-systems.
We consider the following example.
Example 3.1. Let Ω = {1, 2, 3, 4, 5, 6} and N = {(1, 2), (3, 4), (5, 6)} be a
Sperner-system. It can be seen that N^{-1} = {(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6),
(2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)}. Let K = N ∪ N^{-1}. It is clear that K is saturated.
We use the algorithm which finds a set of antikeys. Then K^{-1} = {(1, 3), (1, 4), (1, 5),
(1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 5), (3, 6), (4, 5), (4, 6)}.
Since K^{-1} ∪ {(1, 2)} is a Sperner-system, it is obvious that K^{-1} is not
saturated. Thus, we have
Corollary 3.2. There is a K so that K is saturated and K^{-1} is not saturated.
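The claims of Example 3.1 can be verified by exhaustive search over the 64 subsets of Ω. The sketch below is a brute-force check, not the paper's algorithm; the function names are assumptions, and saturation is tested only against sets outside the family.

```python
from itertools import combinations

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def antikeys_bruteforce(universe, family):
    """Maximal subsets of `universe` containing no member of `family`."""
    free = [a for a in powerset(universe) if not any(b <= a for b in family)]
    return {a for a in free if not any(a < c for c in free)}

def is_saturated(universe, family):
    """True if every subset outside `family` is comparable to some member."""
    return all(
        any(a <= b or b <= a for b in family)
        for a in powerset(universe) if a not in family
    )

OMEGA = frozenset(range(1, 7))
N = {frozenset(p) for p in [(1, 2), (3, 4), (5, 6)]}
N_inv = antikeys_bruteforce(OMEGA, N)
K = N | N_inv
K_inv = antikeys_bruteforce(OMEGA, K)

print(len(N_inv), len(K_inv))
print(is_saturated(OMEGA, K), is_saturated(OMEGA, K_inv))
```

The set (1, 2) is comparable to no member of K^{-1}, which is why the second test fails.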
Now we define the following notion.
Definition 3.3. Let K be a Sperner-system over Ω. We say that K is embedded
if for every A ∈ K there is a B ∈ H such that A ⊂ B, where H^{-1} = K. We have
Theorem 3.4. Let K be a Sperner-system over Ω. K is saturated if and only if
K^{-1} is embedded.
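Definition 3.3 can be tried out on the families listed in Example 3.1. In the sketch below the sets are copied from that example rather than recomputed, and the function name `is_embedded` is an assumption; its second argument plays the role of H, the system whose antikeys form the first argument.

```python
from itertools import product

def is_embedded(family, origin):
    """Every member of `family` is a proper subset of some member of `origin`
    (`family` is assumed to be the set of antikeys of `origin`)."""
    return all(any(a < b for b in origin) for a in family)

N = {frozenset(p) for p in [(1, 2), (3, 4), (5, 6)]}
# N^{-1}: one element from each of the pairs (1, 2), (3, 4), (5, 6).
N_inv = {frozenset(t) for t in product((1, 2), (3, 4), (5, 6))}
K = N | N_inv
K_inv = {frozenset(p) for p in [
    (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4),
    (2, 5), (2, 6), (3, 5), (3, 6), (4, 5), (4, 6),
]}

print(is_embedded(K_inv, K))   # K is saturated
print(is_embedded(N_inv, N))   # N is not saturated: (1, 3) is comparable to no member
```

Both outcomes agree with Theorem 3.4: K^{-1} is embedded while N^{-1} is not.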
Acknowledgement
The author would like to take this opportunity to express deep gratitude to Professor
J. Demetrovics and L. Hannák for their help, valuable comments and suggestions.
References
[1] ARMSTRONG, W. W., Dependency structures of data base relationships. Information Processing
74, North-Holland Publ. Co. (1974) 580–583.
[2] BÉKÉSSY, A., DEMETROVICS, J., HANNÁK, L., KATONA, G. O. H., FRANKL, P., On the number of
maximal dependencies in data relation of fixed order. Discrete Math., 30 (1980) 83–88.
[3] CODD, E. F., Relational model of data for large shared data banks. Communications of the ACM,
13 (1970) 377–384.
[4] DEMETROVICS, J., On the equivalence of candidate keys with Sperner systems. Acta Cybernetica
4 (1979) 247–252.
[5] DEMETROVICS, J., Relációs adatmodell logikai és strukturális vizsgálata. MTA SZTAKI
Tanulmányok, Budapest, 114 (1980) 1–97.
[6] DEMETROVICS, J., FÜREDI, Z., KATONA, G. O. H., Minimum matrix representation of closure
operations. Preprint of the Mathematical Institute of the Hungarian Academy of Sciences, Budapest,
12 (1983) 1–22.
[7] DEMETROVICS, J., FÜREDI, Z., KATONA, G., A függőségek és az individumok száma közötti kapcsolat
összetett adatrendszerek esetén. Alkalmazott Matematikai Lapok 9 (1983) 13–21.